The annual review meeting for research and government
personnel involved in the DARPA program on Knowledge-Based
Systems was held in St. Louis, Missouri, on 21-23 April 1987.
The purpose of the meeting was to review progress on research
efforts undertaken over the past year. Research organizations
participating in the workshop included General Electric,
Stanford University, Ohio State University, University of
Massachusetts, Teknowledge Inc., IntelliCorp, and USC
Information Sciences Institute. Also present were

Semantically Sound Inheritance
for a Formally Defined Frame Language
with Defaults


Robert  Nado  and  Richard  Fikes 


IntelliCorp 

1975  El  Camino  Real  West 
Mountain  View,  California  94040-2216 


Abstract 

Most frame languages either are glaringly deficient
in their treatment of default information or do not
represent it at all. This paper presents a formal
description of a frame language that provides
semantically sound facilities for representing default
information and an efficient serial algorithm for
inheriting default information down class-subclass and
class-member hierarchies constructed in that language.
We present the inheritance algorithm in two forms. In
the first form, the algorithm provides justifications to a
truth maintenance system (TMS), which then manages the
inherited information. In the second form, the algorithm
performs its own special-purpose truth maintenance and is
therefore usable in a system that does not include a
general-purpose TMS.¹


I.  Introduction 

The common-sense reasoning required in many knowledge system
applications relies heavily on the ability to use general information
that is subject to exceptions: what has been called prototypic or
default information. Although frame-based representation
languages have become increasingly popular for expressing the
domain-specific information on which the functionality of
knowledge systems is based [Fikes and Kehler, 1985], most such
languages either are glaringly deficient in their treatment of default
information (as argued, for example, in [Brachman, 1985]
and [Touretzky, 1984]) or do not represent it at all (e.g.,
KL-ONE [Brachman and Schmolze, 1985] and
KRYPTON [Brachman et al., 1983]). Thus, an important step in
the advancement of knowledge system technology is the
development of a frame language that provides semantically sound
facilities for representing and efficiently processing default
information. This paper presents a formal description of such a
frame language (based on the frame language in the KEE
system²) and an efficient serial algorithm for inheriting default
information down class-subclass and class-member hierarchies
constructed in that language. The language has been implemented
at IntelliCorp in a system called OPUS.

As observed by Touretzky [Touretzky, 1986], the "shortest
path" ordering of defaults used by most inheritance systems (e.g.,
FRL [Roberts and Goldstein, 1977] and NETL [Fahlman, 1979])
does not always successfully provide the desired preference of more
specific defaults over less specific defaults. Problems arise in some
cases of multiple inheritance, where nodes are allowed to have more
than one parent link. An example, adapted from Touretzky, is
depicted in Figure 1. The typical inheritance algorithm correctly
prefers White over Grey as a default color for a royal elephant,
because the default from RoyalElephants has a "shorter path" than
the default from Elephants. However, in the situation shown in the
figure, Clyde has a redundant class membership link to Elephants.
Clyde, then, inherits both the default White from RoyalElephants
and the default Grey from Elephants along paths of equal length.
Thus, shortest-path algorithms are not sufficient to correctly
handle this situation.³ This and other shortcomings of existing
algorithms are overcome in the OPUS algorithm presented here.

¹ This research was supported in part by the Defense Advanced Research Projects
Agency (DARPA) under contract No. F30602-85-C-0065. The views and
conclusions reported here are those of the authors and should not be construed as
representing the official position or policy of DARPA or the U.S. government.

² KEEworlds, KEE, and Knowledge Engineering Environment are trademarks of
IntelliCorp.
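The redundant-link problem described for Figure 1 can be reproduced with a few lines of code. The following is a minimal sketch of a generic shortest-path lookup, not the FRL or NETL implementation; the dictionary encoding and the function name `shortest_path_defaults` are assumptions made only for illustration.

```python
from collections import deque

# Hypothetical encoding of Figure 1: each node maps to its parent links.
PARENTS = {
    "Clyde": ["RoyalElephants", "Elephants"],  # the second link is the redundant one
    "RoyalElephants": ["Elephants"],
    "Elephants": [],
}
DEFAULT_COLOR = {"Elephants": "Grey", "RoyalElephants": "White"}


def shortest_path_defaults(node, parents, defaults):
    """Return every default value found at the minimum link distance from node."""
    seen = {node}
    frontier = deque([(node, 0)])
    best_dist, found = None, set()
    while frontier:
        current, dist = frontier.popleft()
        if best_dist is not None and dist > best_dist:
            break  # everything left in the queue is farther away
        if current in defaults:
            best_dist = dist
            found.add(defaults[current])
            continue  # a default shadows anything above it on this path
        for parent in parents.get(current, []):
            if parent not in seen:
                seen.add(parent)
                frontier.append((parent, dist + 1))
    return found
```

With the redundant link present, the lookup for Clyde returns both White and Grey, so path length alone cannot pick the more specific default. Removing the direct Clyde-to-Elephants link leaves only White.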

Figure 1: A Problem with the "Shortest Path" Ordering

An additional motivation for this work is to enable "truth
maintenance" (or "reason maintenance", as it is sometimes called)
capabilities to be incorporated into frame-based representation
systems. Truth maintenance algorithms provide an automatic
means of managing derived results as changes are made in a
model [Doyle, 1979]. In addition, a truth maintenance system
(TMS) can be used as the basis for a context mechanism that
enables a frame system to model and compare multiple
hypothetical situations (as was done, for example, in the
KEEworlds™ facility [Morris and Nado, 1986]).

Inheritance mechanisms add derived results to a model. They
also typically provide an efficient special-purpose form of truth
maintenance for those results in that they remove information they
have derived when a change occurs in the form or content of the
hierarchies on which those derivations are based. If a general-purpose
TMS has been incorporated into a frame system, then the
TMS can be used to maintain the inherited information, thereby
significantly reducing the complexity of the inheritance mechanism.
However, such a reduction can be obtained only if the derivations
performed during inheritance are expressible in the logical
formalism supported by the TMS.

³ The same problem is obtained if equal numbers of intermediate subclasses are
added along the two paths from Clyde to Elephants.

The inheritance algorithm in the current KEE system (and in
other similar systems) is unsuitable for providing such justifications
because it depends on arbitrary LISP procedures to perform its
deductions and allows those procedures to use information whose
semantic interpretation is unclear, such as the order in which
inheritance links are stored. The OPUS inheritance algorithm we
present here performs sound deductions describable to a TMS in
the form of nonmonotonic justifications whose justifiers are
propositions expressible in the frame language. OPUS, therefore, in
combination with the KEEworlds system, performs context-relative
inheritance.

After  presenting  the  formal  description  of  the  frame  language, 
we  present  the  OPUS  inheritance  algorithm  in  two  forms.  In  the 
first  form,  the  algorithm  provides  justifications  to  a  TMS,  which 
then  manages  the  inherited  information.  In  the  second  form,  the 
algorithm  performs  its  own  truth  maintenance  and  therefore  is 
useable  in  a  system  that  does  not  include  a  TMS. 

II.  A  Frame  Language  with 
Defaults  and  Exceptions 

The formal description we present for the OPUS frame language is
based on the formalism developed by Etherington for default and
exception links in inheritance networks [Etherington, 1987]. We
found it desirable to extend Etherington's formalism in several
ways to accommodate the structure of a frame language, to include
more powerful constructs for describing exceptions to defaults, and
to provide for the overriding of defaults at superclasses by defaults
at subclasses. Those extensions are noted in our presentation and
the motivations for them are discussed.

A.  The  Language  To  Which  Defaults  Were  Added 

We  begin  by  providing  a  set-theoretic  description  of  the  OPUS 
frame  language  before  defaults  and  exceptions  were  added. 

1. Frames

A frame represents an entity in the domain of discourse.
Formally, a frame corresponds to a logical constant. A frame
includes a collection of own slots that describe binary relationships
considered to hold between the entity represented by the frame and
other entities in the domain. A frame's collection of own slots
necessarily includes MemberOf, which represents the standard set
(i.e., class) membership predicate from set theory.

2.  Class  Frames 

A class frame is a frame that represents a collection (i.e.,
class) of entities in the domain of discourse. Such a class is itself
considered to be an entity in the domain of discourse. Thus, a
class frame has associated with it a collection of own slots
describing the binary relationships that the class has with other
entities. Those own slots include Subclass, SubclassOf, Member,
and MemberOf, which represent the standard subset and set
membership predicates from set theory. These slots provide the
"links" over which inheritance is done. In addition, a class frame
has associated with it a collection of prototype slots that describe
binary relationships considered to hold between each member of the
class represented by the frame and other entities in the domain.

3.  Own  Slots 

An own slot has associated with it a collection of values, each
of which represents an entity in the domain of discourse. Formally,
an own slot named S has associated with it a binary predicate,
which for convenience we will also call S. An own slot S in a frame
F having value V corresponds to the assertion S(F,V).

4.  Prototype  Slots 

A prototype slot has associated with it a collection of
necessary values, each of which represents an entity in the domain
of discourse. Formally, a prototype slot S has associated with it a
binary predicate NecS. A prototype slot S in a class frame C
having necessary value V corresponds to the assertion NecS(C,V).
Predicate NecS is related to predicate S by the following
definition:⁴

    NecS(C, V) ≡ ∀x [MemberOf(x, C) ⊃ S(x, V)]

The following theorem follows from this definition and the set
theory definition of SubclassOf in terms of MemberOf:

    NecS(C, V) ∧ SubclassOf(x, C) ⊃ NecS(x, V)

That is, necessary values of a prototype slot at a class frame
representing a class C are also necessary values of the prototype
slot at all class frames representing subsets of C. The OPUS
inheritance algorithm performs the deductions implied by the
definition of NecS and by the theorem by propagating necessary
values of prototype slots to all subclasses and class members.

The OPUS frame language without defaults can be
characterized as expressing statements of the form S(x,y) and
NecS(x,y) for arbitrary first order binary predicates S. The
language does not recurse in that it does not represent predicates of
the form NecNecS.
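The default-free language can be illustrated with a small data structure. This is a minimal sketch under stated assumptions, not the OPUS or KEE implementation: the class `FrameKB`, its method names, and the slot and value names `HasPart` and `Trunk` are hypothetical, and the hierarchy is assumed to be built before any necessary value is asserted.

```python
from collections import defaultdict


class FrameKB:
    """Toy frame store: own slots hold S(F, V); prototype slots hold NecS(C, V)."""

    def __init__(self):
        self.own = defaultdict(set)         # (frame, slot) -> values
        self.nec = defaultdict(set)         # (class frame, slot) -> necessary values
        self.subclasses = defaultdict(set)  # class -> direct subclasses
        self.members = defaultdict(set)     # class -> direct members

    def add_subclass(self, sub, sup):
        self.subclasses[sup].add(sub)
        self.own[(sub, "SubclassOf")].add(sup)

    def add_member(self, frame, cls):
        self.members[cls].add(frame)
        self.own[(frame, "MemberOf")].add(cls)

    def assert_nec(self, cls, slot, value):
        """Record NecS(cls, value) and push it down the hierarchy."""
        if value in self.nec[(cls, slot)]:
            return  # already propagated through this class
        self.nec[(cls, slot)].add(value)
        for member in self.members[cls]:   # definition of NecS
            self.own[(member, slot)].add(value)
        for sub in self.subclasses[cls]:   # theorem for subclasses
            self.assert_nec(sub, slot, value)
```

For example, after `add_subclass("RoyalElephants", "Elephants")` and `add_member("Clyde", "RoyalElephants")`, asserting a necessary value at Elephants makes it a necessary value at RoyalElephants and an own-slot value at Clyde.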

B.  Adding  Defaults  and  Exceptions 

Our goal was to augment the frame language described above to
enable class frames to include prototypical descriptions of class
members. That is, we wanted to enable prototype slots to have
default values that would be inherited to class members as
assumed values for the corresponding own slots unless blocked by
exceptions.

We began by attempting to directly implement the formalism
for defaults with exceptions in inheritance networks described by
Etherington [Etherington, 1987]. Etherington's formalism is stated
entirely in terms of unary class membership predicates. That is, he
treats each class C as a unary predicate, C(x), that is true when x
is a member of C. He defines a "Membership" link between an
object a and a class C to mean a belongs to class C (i.e., C(a)).
The OPUS MemberOf own slot corresponds to the membership
link. He defines a "Strict IS-A" link between class C1 and class C2
to mean C1's are always C2's (i.e., ∀x [C1(x) ⊃ C2(x)]). The OPUS
SubclassOf own slot corresponds to the strict IS-A link.

Own slots are treated in Etherington's formalism by
considering each slot-value pair (S,V) to be a unary predicate,
SV(x), corresponding to the class of all objects having value V for
own slot S (e.g., the class of objects having color grey). Given that
formalism for own slots, a necessary value V of a prototype slot S
in a class frame C is a strict IS-A link between C and SV.

Etherington represents default information in his inheritance
networks by "Default IS-A" and "Exception" links. A default IS-A
link from class C1 to class C2 means "Normally, C1's are C2's",
and is expressed formally by the default logic inference rule:

    C1(x) : C2(x)
    -------------
        C2(x)

The interpretation of this rule is: if C1(x) (called the
prerequisite) is known and C2(x) (called the justification where it
appears above the line) is consistent with what is known, then
C2(x) (called the consequent where it appears below the line) may
be concluded.

⁴ Here and in the rest of the paper, free variables are implicitly universally
quantified.

An exception link has a class at its tail and a default IS-A link
at its head. An exception link from class C1 to a default IS-A link
from C2 to C3 means "C1's are exceptions to C2's being C3's"
(e.g., "Royal elephants are exceptions to elephants being grey").
Etherington provides no independent semantics for an exception
link. Instead, he defines it formally as a modification to the
default rule corresponding to the link being blocked. However,
Doyle has suggested (as reported by Touretzky [Touretzky, 1986])
that if the justification of the default rule corresponding to a
default IS-A link contains an additional unary predicate unique to
that default, then an exception link blocking that default can be
defined to correspond to an assertion of the negation of that
predicate for each member of the class at the tail of the link.
Following that suggestion, a default IS-A link from class C1 to
class C2 would correspond to the default rule:

    C1(x) : C2(x) ∧ ¬ExceptionToC1C2(x)
    -----------------------------------
                  C2(x)

and an exception link from C3 to the default IS-A link from C1 to
C2 would correspond to the implication:

    ∀x [C3(x) ⊃ ExceptionToC1C2(x)].
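The effect of Doyle's suggestion can be sketched in code. This is a rough illustration only: facts are stored as tuples, and "consistent with what is known" is approximated by checking that no contradicting fact has been explicitly recorded, which is far weaker than a real default logic prover. The function names and the individual `Dumbo` are hypothetical.

```python
def default_applies(facts, x, c1, c2):
    """Check the rule  C1(x) : C2(x) ∧ ¬ExceptionToC1C2(x) / C2(x)  for one x."""
    exception = f"ExceptionTo{c1}{c2}"
    prerequisite_known = (c1, x) in facts
    # Approximation: the justification is consistent when neither ¬C2(x)
    # nor ExceptionToC1C2(x) has been recorded.
    justification_consistent = (
        ("not", c2, x) not in facts and (exception, x) not in facts
    )
    return prerequisite_known and justification_consistent


def add_exception_link(facts, c3, c1, c2):
    """Apply ∀x [C3(x) ⊃ ExceptionToC1C2(x)] to the recorded members of C3."""
    exception = f"ExceptionTo{c1}{c2}"
    members = [f[1] for f in facts if len(f) == 2 and f[0] == c3]
    return facts | {(exception, x) for x in members}


facts = {
    ("Elephant", "Clyde"),
    ("RoyalElephant", "Clyde"),
    ("Elephant", "Dumbo"),  # hypothetical ordinary elephant
}
blocked = add_exception_link(facts, "RoyalElephant", "Elephant", "Grey")
```

Before the exception link is added, the default "elephants are grey" applies to Clyde. Afterwards it applies to Dumbo but not to Clyde, because the exception predicate has been asserted for every known royal elephant.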

To add Etherington's default IS-A and exception links to the
OPUS frame language, we associate with each prototype slot in a
class frame a set of default values and a set of prototype
exceptions, and we associate with each own slot in a frame a set of
own exceptions. Defaults consist of pairs of values and classes,
where the class in the pair provides information to the inheritance
mechanism indicating the class where the default originated.
Prototype and own exceptions also consist of pairs of values and
classes. An own exception (V,OC) for an own slot S in a frame F
blocks inheritance to F of default value (V,OC) for S, and
corresponds to the assertion "F is an exception to OC having S V"
(e.g., "Clyde is an exception to elephants having color grey"). A
prototype exception (V,OC) for a prototype slot S in a class frame
C blocks inheritance to C and to all its members and subclasses of
default value (V,OC) for S. Such a prototype exception
corresponds to the assertion "Members of C are exceptions to OC
having S V". For example, "Royal elephants are exceptions to
elephants having color grey".

Formally, a default value (V,C) for a prototype slot S in the
class frame C corresponds to the assertion DefS(C,V); a default
value (V,OC) for a prototype slot S in a class frame C representing a
subclass of the class represented by class frame OC corresponds to
the assertion SubDefS(C,V,OC); an own exception (V,OC) for an
own slot S in a frame F corresponds to the assertion
OwnExcS(F,V,OC); and a prototype exception (V,OC) for a
prototype slot S in a class frame C corresponds to the assertion
ProExcS(C,V,OC). These predicates are related by the following
definitions.

1.  ProExcS 

ProExcS(C,V,OC) means there is an own exception at each
member x of C blocking the inheritance of default value V from
class OC to own slot S in x. For a given binary predicate S,
ProExcS is defined as follows:

    ProExcS(C, V, OC) ≡
        ∀x [MemberOf(x, C) ⊃ OwnExcS(x, V, OC)]

As was the case for predicate NecS, the definition of ProExcS
implies that prototype exceptions are inherited to subclasses. That
is:

    ProExcS(C, V, OC) ∧ SubclassOf(x, C)
        ⊃ ProExcS(x, V, OC)

An assertion of the form ProExcS(C,V,OC) corresponds in
Etherington's formalism to an exception link from C to a default
IS-A link from OC to SV. OwnExcS statements are inferred from
ProExcS statements and serve, following Doyle's suggestion, to
block default rules at appropriate class members.
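The two consequences of the ProExcS definition (inheritance to subclasses and instantiation as own exceptions at members) can be sketched as a small closure computation. This is an illustrative sketch, not the OPUS algorithm; the function name `derive_exceptions`, the tuple encodings, and the frames `CircusRoyalElephants`, `Jumbo`, and `Dumbo` are hypothetical.

```python
def derive_exceptions(pro_exc, subclass_of, member_of):
    """Close ProExcS under subclass links, then derive OwnExcS at members.

    pro_exc:     set of (C, V, OC) prototype exceptions asserted directly
    subclass_of: set of (sub, sup) direct subclass links
    member_of:   set of (x, C) direct membership links
    Returns (all prototype exceptions, derived own exceptions).
    """
    all_pro = set(pro_exc)
    changed = True
    while changed:
        changed = False
        for c, v, oc in list(all_pro):
            for sub, sup in subclass_of:
                # ProExcS(C,V,OC) ∧ SubclassOf(sub,C) ⊃ ProExcS(sub,V,OC)
                if sup == c and (sub, v, oc) not in all_pro:
                    all_pro.add((sub, v, oc))
                    changed = True
    # ProExcS(C,V,OC) ∧ MemberOf(x,C) ⊃ OwnExcS(x,V,OC)
    own = {(x, v, oc) for c, v, oc in all_pro for x, cls in member_of if cls == c}
    return all_pro, own
```

Given the prototype exception (RoyalElephants, Grey, Elephants), members of RoyalElephants and of any of its subclasses receive the own exception (Grey, Elephants), while a frame that is only a member of Elephants does not.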

2.  DefS 

DefS(C,V) means that for each member x of C, if it is
consistent to assume both that V is a value of own slot S in x and
that no own exception at x blocks the inheritance of V for S from
C, then it can be inferred that V is a value of own slot S in x. For a
given binary predicate S, DefS is defined as follows:

    DefS(C, V) ≡

        MemberOf(x, C) : S(x, V) ∧ ¬OwnExcS(x, V, C)
        --------------------------------------------
                          S(x, V)

DefS(C,V) corresponds in Etherington's formalism to a default
IS-A link from C to SV.

3.  SubDefS 

The SubDefS predicate is an extension to Etherington's
formalism to provide for the inheritance of defaults to prototype
slots in subclasses. That is, the frame language is designed so that
the prototype slots at any given class frame C have all the
necessary and default values to be inherited by members of C that
have been asserted at C or at any of C's superclasses. For example,
the class frame AfricanElephants inherits from class frame
Elephants the default value grey for the color prototype slot.
Etherington has nothing in his formalism corresponding to that
functionality.

For a given binary predicate S, SubDefS statements are
inferred from DefS statements by the following axiom and default
rule:

    DefS(C, V) ⊃ SubDefS(C, V, C)

    SubDefS(C, V, OC) ∧ SubclassOf(Csub, C) : ¬ProExcS(Csub, V, OC)
    ---------------------------------------------------------------
                       SubDefS(Csub, V, OC)

Defaults asserted at a class as DefS statements are used to
infer SubDefS statements at the class and are inherited to all
subclasses as SubDefS statements.
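The axiom and the default rule for SubDefS can be read as a worklist propagation. The sketch below assumes that the set of prototype exceptions passed in is already fully instantiated and closed under subclass links; the function name `sub_defaults` and the tuple encodings are assumptions made for illustration.

```python
def sub_defaults(defs, subclass_of, pro_exc):
    """Derive SubDefS(C, V, OC) statements from DefS(C, V) statements.

    defs:        set of (C, V) defaults asserted at their originating class
    subclass_of: set of (sub, sup) direct subclass links
    pro_exc:     set of (C, V, OC) prototype exceptions (assumed complete)
    """
    # Axiom: DefS(C, V) ⊃ SubDefS(C, V, C)
    sub_def = {(c, v, c) for c, v in defs}
    work = list(sub_def)
    while work:
        c, v, oc = work.pop()
        for sub, sup in subclass_of:
            if sup != c:
                continue
            candidate = (sub, v, oc)
            # Default rule: inherit to the subclass unless an exception blocks it.
            if candidate not in pro_exc and candidate not in sub_def:
                sub_def.add(candidate)
                work.append(candidate)
    return sub_def
```

With defaults Grey at Elephants and White at RoyalElephants, and a prototype exception (RoyalElephants, Grey, Elephants), AfricanElephants receives Grey with origin Elephants while RoyalElephants keeps only its own default White.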

C.  Quantified  Exceptions 

Etherington's link types and the statement forms we have
introduced thus far for OPUS allow exceptions to be stated for
specific values from specific origin classes. In practice, however,
there is a need to assert collections of exception links. For
example, one typically wants to state for a given slot in a given
class frame (say the color slot in RoyalElephants) that any default
value from any superclass is to be blocked and replaced by a given
default value. Such assertions would be second order statements in
Etherington's formalism. We can express them in the OPUS
formalism as first order quantified statements as follows:

    ∀v OwnExcS(O, v, OC)

    ∀oc OwnExcS(O, V, oc)

    ∀v,oc OwnExcS(O, v, oc)

    ∀v ProExcS(C, v, OC)

    ∀oc [SubclassOf(C, oc) ⊃ ProExcS(C, V, oc)]

    ∀v,oc [SubclassOf(C, oc) ⊃ ProExcS(C, v, oc)]

The quantification of the origin class that is supported for
prototype exceptions is only to superclasses of the class to whose
members the exception applies. The restriction to superclasses is
meant to implement the intuition that defaults at subclasses
override defaults at superclasses. For example, a default color for
royal elephants overrides a default color for elephants. Thus, we
do not want a quantified prototype exception to block defaults
from sibling classes and subclasses, but only from superclasses.
(Note, however, that the unquantified form of ProExcS blocks
defaults from any given class, including sibling classes and
subclasses. The ability to block defaults from siblings may be
useful in that it allows one to express a precedence ordering of
defaults between classes even though their subclass-superclass
relationship is unknown.)

As observed by Touretzky [Touretzky, 1984], the natural
partial ordering of defaults in inheritance systems defined by the
hierarchical structure of the inheritance graph resolves many
ambiguities in an intuitive way. Touretzky introduces an
"inferential distance" measure that expresses the desired natural
ordering of defaults and uses that measure to filter out extensions
that violate the ordering. In OPUS, that effect is obtained by the
explicit quantification of exceptions over superclasses. In
Touretzky's formalism, an exception always blocks a specific
default value from all superclasses. Thus, unlike in OPUS, he
cannot block all values from superclasses nor can he block values
from a given superclass.
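The superclass restriction on quantified prototype exceptions can be made concrete with a small matching function. This is a sketch under assumptions: `*` stands for a quantified position, SubclassOf is read as proper superclass (so a class's own default is not blocked by its own quantified exception), and the names `superclasses`, `pro_exc_blocks`, and the value `Brown` are hypothetical.

```python
ANY = "*"  # stands for a universally quantified value or origin class


def superclasses(cls, parents):
    """All proper superclasses of cls, following direct SubclassOf links."""
    result, stack = set(), list(parents.get(cls, []))
    while stack:
        c = stack.pop()
        if c not in result:
            result.add(c)
            stack.extend(parents.get(c, []))
    return result


def pro_exc_blocks(exc, holder, value, origin, parents):
    """Does prototype exception exc at class holder block default (value, origin)?"""
    value_spec, origin_spec = exc
    value_ok = value_spec in (ANY, value)
    if origin_spec == ANY:
        # Quantification over origin classes ranges over superclasses only.
        origin_ok = origin in superclasses(holder, parents)
    else:
        # An unquantified origin may name any class, including a sibling.
        origin_ok = origin_spec == origin
    return value_ok and origin_ok


PARENTS = {
    "RoyalElephants": ["Elephants"],
    "AfricanElephants": ["Elephants"],
    "Elephants": [],
}
```

Under this reading, the exception `("*", "*")` at RoyalElephants blocks Grey from Elephants but leaves both the class's own default and any default originating at the sibling AfricanElephants untouched; blocking a sibling's default requires naming that sibling explicitly.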

In summary, for any first order binary predicate S, the OPUS
frame language represents statements of the following forms (with
their Etherington link equivalents where applicable):

    S(O, V)                      O --Member--> SV
    NecS(C, V)                   C --IS-A--> SV
    DefS(C, V)                   C --Def.IS-A--> SV
    SubDefS(C, V, OC)
    OwnExcS(O, V, OC)
    ∀v OwnExcS(O, v, OC)
    ∀oc OwnExcS(O, V, oc)
    ∀v,oc OwnExcS(O, v, oc)
    ProExcS(C, V, OC)            C --Exc--> (OC --Def.IS-A--> SV)
    ∀v ProExcS(C, v, OC)
    ∀oc [SubclassOf(C, oc) ⊃ ProExcS(C, V, oc)]
    ∀v,oc [SubclassOf(C, oc) ⊃ ProExcS(C, v, oc)]

The system does not recurse in that it does not represent
NecNecS, DefNecS, etc.

Consider how this formalism would be used to express the
situation shown in Figure 1. DefColor statements would be used at
Elephants and RoyalElephants to express the two defaults, and a
quantified prototype exception statement would be used at
RoyalElephants to block the inheritance of default colors from all
superclasses, as follows:

    DefColor(Elephants, Grey)

    DefColor(RoyalElephants, White)

    ∀v,oc [SubclassOf(RoyalElephants, oc)
        ⊃ ProExcColor(RoyalElephants, v, oc)]

III.  A  "Push"  Inheritance  Algorithm 
for  Defaults  and  Exceptions 

The OPUS frame language has been implemented by modifying the
frame language in the KEE system. The inheritance mechanism
implements the deductions defined by the definitions, axioms, and
theorems given above by "pushing" necessary member slot values
when they are asserted to subclasses and class members, and
pushing default member slot values when they are asserted to
subclasses and class members unless blocked by exceptions.

In this section we describe the algorithm in two forms, one
assuming the availability of a TMS to maintain the derived results
and the other not. In both cases we describe the information
associated with each slot in the implementation and the operations
performed by the algorithm.

A.  What's  In  A  Slot? 

Each own slot in a frame has associated with it sets of values and
own exceptions. Own exceptions are ordered pairs of the form
(<value spec>, <origin class spec>), where <value spec> is
either a value or the reserved symbol *, and <origin class spec> is
either a class or the reserved symbol *. The * symbol matches any
origin class or value and thereby corresponds to quantified own
exceptions.

Each prototype slot in a class frame has associated with it sets
of necessary values, default values, and prototype exceptions.
Default values are ordered pairs of the form (<value>, <origin
class>) and prototype exceptions are ordered pairs of the form
(<value spec>, <origin class spec>). The * symbol in prototype
exceptions matches any value or any origin class that is a
superclass and thereby corresponds to the desired forms of
quantified prototype exceptions.
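The slot contents just described map naturally onto two record types. The following is a minimal sketch, assuming Python dataclasses as the representation; the class and field names are hypothetical and are not taken from the KEE or OPUS implementation. Only own-exception matching is shown, since the `*` in an own exception carries no superclass restriction.

```python
from dataclasses import dataclass, field

ANY = "*"  # reserved symbol for a quantified value or origin class


@dataclass
class OwnSlot:
    values: set = field(default_factory=set)
    # Own exceptions: pairs (value spec, origin class spec).
    exceptions: set = field(default_factory=set)

    def blocks(self, value, origin):
        """True if some own exception matches the default (value, origin)."""
        return any(
            value_spec in (ANY, value) and origin_spec in (ANY, origin)
            for value_spec, origin_spec in self.exceptions
        )


@dataclass
class PrototypeSlot:
    necessary: set = field(default_factory=set)   # necessary values
    defaults: set = field(default_factory=set)    # pairs (value, origin class)
    exceptions: set = field(default_factory=set)  # pairs (value spec, origin class spec)
```

For instance, an own slot holding the exception `("*", "Elephants")` blocks any default originating at Elephants, whatever its value, but does not block a default originating at RoyalElephants.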

B.  Inheritance  with  a  TMS 

In order to perform inheritance using a TMS, each value or
exception that is considered for a slot has an assertion (TMS node)
associated with it. The assertion's formula (TMS datum) is as
described in Section II for the different types of values and
exceptions. A value or exception is added to a slot by giving its
corresponding assertion a suitable justification, either a primitive
justification or a justification recording some deduction external to
the inheritance system. A given slot has a particular value or
exception just in case the TMS assigns a positive belief status to its
corresponding assertion. Demons associated with each slot are
triggered by the TMS when an assertion concerning the slot is
believed for the first time. A demon for a particular value or
exception type is responsible for determining which inheritance
justifications involving the newly believed assertion should be
added to the TMS.

Necessary values of prototype slots are inherited to class
members as values of own slots via justifications of the following
form:

    NecS(C, V) ∧ MemberOf(Memb, C)
        → S(Memb, V)

Necessary values of prototype slots are inherited to subclasses
via justifications of the following form:

    NecS(C, V) ∧ SubclassOf(Csub, C)
        → NecS(Csub, V)

Prototype exceptions are inherited from classes to class
members via justifications of the following form:

    ProExcS(C, V, OC) ∧ MemberOf(Memb, C)
        → OwnExcS(Memb, V, OC)

Prototype exceptions are inherited from classes to subclasses
via justifications of the following form:

    ProExcS(C, V, OC) ∧ SubclassOf(Csub, C)
        → ProExcS(Csub, V, OC)

Default values of prototype slots are inherited to class
members as values of own slots via nonmonotonic justifications of
the following form:

    SubDefS(C, V, OC) ∧ MemberOf(Memb, C) ∧
    OUT[OwnExcS(Memb, V, OC)]
        → S(Memb, V)

Note that these justifications contain no OUT justifier for
¬S(Memb,V), although the formal definition of default values
requires one. Such a justifier is not needed since statements of the
form ¬S(Memb,V) cannot be expressed in the frame language and are
therefore necessarily out.

Default values of prototype slots are inherited to subclasses via
nonmonotonic justifications of the following form:

    SubDefS(C, V, OC) ∧ SubclassOf(Csub, C) ∧
    OUT[ProExcS(Csub, V, OC)]
        → SubDefS(Csub, V, OC)

As before, these justifications do not need to have an OUT
justifier for ¬SubDefS(Csub,V,OC) because statements of the form
¬SubDefS(Csub,V,OC) cannot be expressed in the frame language
and are therefore necessarily out.

Quantified  own  exceptions  are  used  to  generate  instantiated 
own  exceptions  as  needed  to  block  the  inheritance  of  default  values 
that  match  the  quantified  form.  The  instantiated  exceptions  are 
produced  via  justifications  of  the  following  forms: 

    OwnExcS(F, *, OC) → OwnExcS(F, V, OC)

    OwnExcS(F, V, *) → OwnExcS(F, V, OC)

    OwnExcS(F, *, *) → OwnExcS(F, V, OC)

Quantified  prototype  exceptions  are  not  inherited.  Instead, 
they  are  used  to  generate  instantiated  prototype  exceptions  as 
needed  to  block  the  inheritance  of  default  values  that  match  the 
quantified  form.  The  instantiated  exceptions  are  produced  via 
justifications  of  the  following  forms: 

    ProExcS(C, *, OC) ∧ SubclassOf(C, Csuper) ∧
    SubDefS(Csuper, V, OC) → ProExcS(C, V, OC)

    ProExcS(C, V, *) ∧ SubclassOf(C, Csuper) ∧
    SubDefS(Csuper, V, OC) → ProExcS(C, V, OC)

    ProExcS(C, *, *) ∧ SubclassOf(C, Csuper) ∧
    SubDefS(Csuper, V, OC) → ProExcS(C, V, OC)

1.  Example 

Consider the statements that would be asserted and derived by
this inheritance mechanism for the example from Figure 1. The
inheritance of color Grey from Elephants to RoyalElephants would
be done via the following justification:

    SubDefColor(Elephants, Grey, Elephants) ∧
    SubclassOf(RoyalElephants, Elephants) ∧
    OUT[ProExcColor(RoyalElephants, Grey, Elephants)]
        → SubDefColor(RoyalElephants, Grey, Elephants)

The inheritance of color Grey from Elephants to Clyde would
be done via the following justification:

    SubDefColor(Elephants, Grey, Elephants) ∧
    MemberOf(Clyde, Elephants) ∧
    OUT[OwnExcColor(Clyde, Grey, Elephants)]
        → Color(Clyde, Grey)

The generation of the instantiated prototype exception for
Grey at RoyalElephants would be done via the following
justification:

    ProExcColor(RoyalElephants, *, *) ∧
    SubclassOf(RoyalElephants, Elephants) ∧
    SubDefColor(Elephants, Grey, Elephants)
        → ProExcColor(RoyalElephants, Grey, Elephants)

The instantiated prototype exception for Grey at
RoyalElephants prevents inheritance of Grey as a default to
RoyalElephants. Thus, no justification is generated for inheriting
Grey from RoyalElephants to Clyde. Inheritance of the
instantiated prototype exception for Grey at RoyalElephants to
Clyde would be done via the following justification:

    ProExcColor(RoyalElephants, Grey, Elephants) ∧
    MemberOf(Clyde, RoyalElephants)
        → OwnExcColor(Clyde, Grey, Elephants)

That inherited exception would block the inheritance of Grey
to Clyde.

The  inheritance  of  color  White  from  RoyalElephants  to  Clyde 
would  be  done  via  the  following  justification: 

SubDefColor(RoyalElephants, White, RoyalElephants) ∧
MemberOf(Clyde, RoyalElephants) ∧
OUT[OwnExcColor(Clyde, White, RoyalElephants)]
    → Color(Clyde, White)

Since there is no exception at Clyde blocking the inheritance of
White from RoyalElephants, White will become the color of Clyde.
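
The five justifications of this example can be checked mechanically.
The following is a minimal Python sketch, not the authors'
implementation and not a full TMS: statements are plain strings, the
names Justification, justify, and label are hypothetical, and the
labeling loop assumes the justification network is acyclic. Each
round re-derives every consequent from the previous labeling, so a
default concluded before its blocking exception appeared is withdrawn
in a later round.

```python
from typing import FrozenSet, Iterable, List, NamedTuple, Set


class Justification(NamedTuple):
    in_list: FrozenSet[str]   # statements that must be IN
    out_list: FrozenSet[str]  # statements that must be OUT
    consequent: str


def justify(ins: Iterable[str], outs: Iterable[str],
            consequent: str) -> Justification:
    return Justification(frozenset(ins), frozenset(outs), consequent)


def label(premises: Set[str], justifications: List[Justification],
          max_rounds: int = 100) -> Set[str]:
    """Return the IN statements of an acyclic justification network."""
    current = set(premises)
    for _ in range(max_rounds):
        derived = set(premises)
        for j in justifications:
            # A justification fires when all of its IN statements hold
            # and none of its OUT statements do.
            if j.in_list <= current and not (j.out_list & current):
                derived.add(j.consequent)
        if derived == current:
            return current
        current = derived
    raise RuntimeError("labeling did not stabilize; network may be cyclic")


GREY = "SubDefColor(Elephants,Grey,Elephants)"
WHITE = "SubDefColor(RoyalElephants,White,RoyalElephants)"
SUB = "SubclassOf(RoyalElephants,Elephants)"
MEM_E = "MemberOf(Clyde,Elephants)"
MEM_R = "MemberOf(Clyde,RoyalElephants)"
QUANT = "ProExcColor(RoyalElephants,*,*)"
PRO_EXC = "ProExcColor(RoyalElephants,Grey,Elephants)"
OWN_EXC = "OwnExcColor(Clyde,Grey,Elephants)"

justifications = [
    justify([GREY, SUB], [PRO_EXC],
            "SubDefColor(RoyalElephants,Grey,Elephants)"),
    justify([GREY, MEM_E], [OWN_EXC], "Color(Clyde,Grey)"),
    justify([QUANT, SUB, GREY], [], PRO_EXC),
    justify([PRO_EXC, MEM_R], [], OWN_EXC),
    justify([WHITE, MEM_R], ["OwnExcColor(Clyde,White,RoyalElephants)"],
            "Color(Clyde,White)"),
]

in_statements = label({GREY, WHITE, SUB, MEM_E, MEM_R, QUANT},
                      justifications)
```

Under these assumptions the labeling keeps Color(Clyde,White) IN and
leaves Color(Clyde,Grey) OUT, matching the outcome described above. A
real TMS would update labels incrementally rather than recompute them.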

C.  Inheritance  Without  a  TMS 

The  above  inheritance  scheme  relies  on  a  TMS  to  remove  inherited 
values  when  the  assertions  on  which  the  inheritance  was  based  are 
removed.  For  example,  if  the  default  color  for  elephants  is 
removed,  then  the  TMS  will  also  remove  Clyde’s  color  if  it  was  in 
the  model  only  because  of  the  default.  Inheritance  without  the 
services  of  a  TMS  is  considerably  more  complex  since  the 
inheritance  machinery  must,  in  effect,  provide  a  truth  maintenance 
capability  for  inherited  values. 

In  order  to  provide  for  the  removal  of  inherited  values,  the 
OPUS  inheritance  machinery  requires  each  slot  to  have  both  a  local 
and  a  resultant  set  of  values  and  exceptions.  The  local  sets  are 
used  only  by  the  inheritance  algorithm  and  contain  those  values  or 
exceptions  that  are  either  asserted  or  are  determined  by  some 
means  other  than  inheritance.  Resultant  sets  contain  all  the  values 
and  exceptions,  including  the  local  ones  and  those  derived  by 
inheritance. When a value or exception is to be added to (or
removed  from)  a  slot,  it  is  added  to  (or  removed  from)  the 
appropriate  local  set  and  the  inheritance  machinery  recomputes  the 
affected  resultant  sets  for  that  slot.  When  the  values  of  the 
MemberOf  (or  SubclassOf)  own  slot  of  a  frame  are  modified,  the 
inheritance  machinery  recomputes  the  resultant  sets  of  each  own 
slot  (or  prototype  slot)  of  the  frame.  When  a  resultant  set  of  a 
prototype  slot  is  modified,  affected  resultant  sets  of  all  its 
descendants  in  the  inheritance  graph  are  recomputed.  In  the 
paragraphs  below,  we  describe  how  each  type  of  resultant  set  is 
computed.  References  in  the  descriptions  to  values  and  exceptions 
are  to  the  resultant  sets  unless  explicitly  indicated  otherwise. 

The set of resultant necessary values for a prototype slot S in
a class frame C is the union of the local set of necessary values for
S in C and, for each Csuper that is a value of the own slot
SubclassOf in C, the set of necessary values for prototype slot S in
Csuper.

The set of resultant default values for a prototype slot S in a
class frame C consists of the local default values for S in C and, for
each Csuper that is a value of the own slot SubclassOf in C, the
default values for prototype slot S in Csuper that do not match an
exception for S in C.

The set of resultant values for an own slot S at a frame F
consists of the local values for S at F and, for each C that is a
value of the own slot MemberOf in F, the necessary values for
prototype slot S in C and the default values for prototype slot S in
C that do not match an own exception for S in F.

The set of resultant exceptions for an own slot S in a frame F
is the union of the local set of exceptions for S in F and, for each C
that is a value of the own slot MemberOf in F, the set of exceptions
for prototype slot S in C.

The set of resultant exceptions for a prototype slot S in a class
frame C consists of the local instantiated exceptions for S in C; for
each Csuper that is a value of the own slot SubclassOf in C, the
exceptions for prototype slot S in Csuper; and each (V, Csuper0)
that matches a local quantified exception for S in C and is a
default value for some Csuper1 that is a value of the own slot
SubclassOf in C.

Note that quantified exceptions remain in the local set and are
not inherited. Quantified exceptions produce instantiated
exceptions as needed to block defaults that would otherwise be
inherited.


1.  Example 

Figure 2 shows the local and resultant values and exceptions
produced by the inheritance algorithm for our elephants example.
The default (Grey, Elephants) at Elephants and the quantified
exception (*, *) at RoyalElephants would cause an instantiated
exception (Grey, Elephants) to be generated at RoyalElephants.
That instantiated exception would be inherited to Clyde. The
exception at Clyde would block inheritance of the (Grey, Elephants)
default from Elephants. The default (White, RoyalElephants) at
RoyalElephants would be inherited to Clyde as Clyde's color.

Elephants (Color)
    Local Defaults:        (Grey, Elephants)
    Resultant Defaults:    (Grey, Elephants)

RoyalElephants (Color)
    Local Exceptions:      (*, *)
    Resultant Exceptions:  (Grey, Elephants)
    Local Defaults:        (White, RoyalElephants)
    Resultant Defaults:    (White, RoyalElephants)

Clyde (Color)
    Resultant Exceptions:  (Grey, Elephants)
    Resultant Values:      White


Figure  2:  Inheritance  without  a  TMS 
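
The resultant-set definitions of this section can be written almost
directly as recursive functions. The following Python sketch is an
illustration under stated assumptions, not the OPUS implementation:
it handles a single slot, the Frame class and all function names are
hypothetical, the hierarchy is assumed to be acyclic, and resultant
sets are recomputed on demand instead of being cached and updated when
a local set changes. Clyde is listed as a member of both classes so
that the inherited exception has a default to block.

```python
STAR = "*"  # wildcard in a quantified exception


class Frame:
    """One frame, restricted to a single slot (Color in the example)."""

    def __init__(self, name, subclass_of=(), member_of=()):
        self.name = name
        self.subclass_of = list(subclass_of)  # direct superclasses
        self.member_of = list(member_of)      # classes this frame is in
        self.local_necessary = set()          # prototype slot: values
        self.local_defaults = set()           # prototype: (value, origin)
        self.local_proto_exceptions = set()   # prototype: (value, origin)
        self.local_values = set()             # own slot: values
        self.local_own_exceptions = set()     # own slot: (value, origin)


def matches(exception, default):
    """True if each exception field is a wildcard or equals the default's."""
    return all(e in (STAR, d) for e, d in zip(exception, default))


def proto_necessary(c):
    result = set(c.local_necessary)
    for sup in c.subclass_of:
        result |= proto_necessary(sup)
    return result


def proto_exceptions(c):
    quantified = {e for e in c.local_proto_exceptions if STAR in e}
    result = c.local_proto_exceptions - quantified  # quantified stay local
    for sup in c.subclass_of:
        result |= proto_exceptions(sup)
        # Instantiate quantified exceptions against the superclass defaults.
        result |= {d for d in proto_defaults(sup)
                   if any(matches(q, d) for q in quantified)}
    return result


def proto_defaults(c):
    blocked = proto_exceptions(c)
    result = set(c.local_defaults)
    for sup in c.subclass_of:
        result |= proto_defaults(sup) - blocked
    return result


def own_exceptions(f):
    result = set(f.local_own_exceptions)
    for c in f.member_of:
        result |= proto_exceptions(c)
    return result


def own_values(f):
    exceptions = own_exceptions(f)
    result = set(f.local_values)
    for c in f.member_of:
        result |= proto_necessary(c)
        result |= {value for (value, origin) in proto_defaults(c)
                   if not any(matches(e, (value, origin))
                              for e in exceptions)}
    return result


elephants = Frame("Elephants")
elephants.local_defaults.add(("Grey", "Elephants"))
royal = Frame("RoyalElephants", subclass_of=[elephants])
royal.local_defaults.add(("White", "RoyalElephants"))
royal.local_proto_exceptions.add((STAR, STAR))
clyde = Frame("Clyde", member_of=[royal, elephants])

royal_exceptions = proto_exceptions(royal)  # the Figure 2 situation
color_before = own_values(clyde)
royal.local_proto_exceptions.clear()        # retract the (*, *) exception
color_after = own_values(clyde)             # recomputed from local sets
```

With the quantified exception in place the sketch gives Clyde the
single color White; once the exception is retracted, recomputation
yields both Grey and White, since nothing then blocks the Elephants
default.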


IV.  Conclusion 

We  have  presented  a  formal  description  of  a  frame  language  that 
makes  a  clear  distinction  between  necessary  and  default  values  of 
prototype  slots.  The  formalization  is  based  on  previous  work  by 
Etherington,  but  extends  his  formalism  to  more  closely  match  the 
structure  of  frame  languages  and  to  allow  more  convenient 
overriding  of  defaults  at  superclasses  by  defaults  at  subclasses. 

We have presented two distinct methods for implementing the
inferences warranted by the formal description of the frame
language. The first makes use of nonmonotonic justifications in a
TMS to record inferences corresponding to default inheritance.
This method is suitable for situations in which a TMS is needed in
order to maintain conclusions derived from non-inheritance
inferences or to implement context-relative inheritance. The second
method, in effect, implements a more efficient, special-purpose
truth maintenance algorithm in order to maintain the validity of
inherited values. It is appropriate for situations in which a
general-purpose TMS is not needed.

A topic of current investigation is how to combine the two
methods into a single system in which the special-purpose
algorithm is used whenever possible. In many applications, general
knowledge about the relationships among classes of objects in the
domain and default values of prototype slots is entered directly by
the domain expert and does not vary during the course of problem
solving. The membership of individuals in classes and the values of
own slots are more likely to be inferred during problem solving and
to vary with hypothetical context. These considerations suggest
that the special-purpose algorithm can be used for maintaining
inherited values in the upper regions of a taxonomy, with the TMS
method being used as appropriate in the lower, more
problem-dependent regions.

Abstract

Intelligent Systems Engineering is characterized by the
need to support large-scale applications, the reuse of
software modules and capabilities, intelligibility of both a
system's definition and its operations, and the integration
of a system with its external hardware and software
environment. ABE is a new-generation software architecture
that supports the process of building Intelligent Systems.

We  describe  our  Intelligent  Systems  Engineering 
methodology,  and  how  features  of  ABE  support  it. 
Focusing  on  two  important  aspects  of  our  methodology, 
we  show  how  to  define  primitive  modules  and  abstract 
datatypes.  In  particular,  we  examine  the  importation  of 
foreign  code  and  data  structures  as  modules  and  abstract 
datatypes. 

1.  Introduction 

In April 1986, Teknowledge, Inc. demonstrated a
preliminary version of ABE, a new-generation software
architecture for building intelligent systems [Erman 86]. In
the year following, we have worked both on developing
and refining the software itself, and on clarifying the
problems that ABE addresses. This report addresses both
of these areas.

Section 2 describes our Intelligent Systems Engineering
methodology, describing in detail the important
properties of intelligent systems, requirements for an
Intelligent Systems Development Environment, and ABE's
methodology for the design of intelligent systems. We
then focus on two important aspects of building
intelligent systems in ABE: Section 3 describes how to build
primitive modules, particularly ones based on imported
code, while Section 4 describes ABE's abstract datatype
mechanism. The rest of this section presents a brief
review of the Module-Oriented Programming methodology
that ABE implements, and finally describes the
application we use as the source of the examples for this
report.

¹ This is an early description of in-progress research. The ideas
described here require experimental testing and will likely change.
This does not constitute a commitment by Teknowledge to any
product or service. ABE, CORAL, Module-Oriented Programming,
and MOP are trademarks of Teknowledge, Inc.

This research is partially sponsored by the Air Force Systems
Command, Rome Air Development Center, Griffiss Air Force Base,
NY 13441-5700 and the Defense Advanced Research Projects Agency,
1400 Wilson Blvd., Arlington, VA 22209, under contract
F30602-85-C-0135.

1.1.  Review  of  Module-Oriented  Programming 

ABE is a multi-level software architecture for building
intelligent systems. It implements a programming
methodology known as Module-Oriented Programming
(MOP). In its simplest form, MOP specifies that systems
are composed of a number of modules which communicate
with each other. Modules can be either primitive,
with no internal structure visible to ABE, or composite,
where their internal structure consists of a number of
other ABE modules. By convention, modules communicate
with each other by passing structured data in the
form of abstract datatypes (ADTs).

Primitive modules are built in one of a number of
supported languages. Currently, ABE supports many
different languages such as Common LISP, CORAL², KEE³,
Knowledge Craft⁴, and MRS.

Frameworks are special-purpose languages that provide
the means to compose modules. Each framework implements
a particular programming metaphor, such as
blackboards or dataflow, by providing an interpreter
which controls the execution of and communication
among the modules composed within that framework.

² CORAL is an object-oriented programming language built on top
of Common LISP as part of the ABE project.

³ KEE is a trademark of IntelliCorp.

⁴ Knowledge Craft is a trademark of Carnegie Group Inc.

MOP requires all modules to have the same external
form. Modules are described by their input/output
behavior, not by the particular language or framework used
to construct them. This property allows a module
builder to implement the internals of a module using the
most appropriate language or framework for that
module, without having to worry about how it will be
used. In particular, a composite module builder can
change the framework used for the composite module but
continue to use the same component modules.
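
The uniform external form can be illustrated with a small Python
sketch. It is not ABE code: the classes PrimitiveModule and
CompositeModule, the two "frameworks" pipeline and broadcast, and the
use of plain dictionaries in place of ADTs are all assumptions made
for illustration. The point it shows is that a composite can swap its
framework while reusing the same components, and can itself serve as
a component.

```python
from typing import Callable, Dict

Data = Dict[str, object]  # stand-in for an ADT passed between modules


class PrimitiveModule:
    """A module whose internals are opaque to the framework."""

    def __init__(self, name: str, body: Callable[[Data], Data]):
        self.name = name
        self._body = body

    def run(self, inputs: Data) -> Data:
        return self._body(inputs)


def pipeline(components, inputs: Data) -> Data:
    """Framework 1: feed each component's output to the next one."""
    data = inputs
    for module in components:
        data = module.run(data)
    return data


def broadcast(components, inputs: Data) -> Data:
    """Framework 2: give every component the same input, collect by name."""
    return {module.name: module.run(inputs) for module in components}


class CompositeModule:
    """A module built from other modules, with the same run() interface."""

    def __init__(self, name, components, framework):
        self.name = name
        self.components = list(components)
        self.framework = framework

    def run(self, inputs: Data) -> Data:
        return self.framework(self.components, inputs)


double = PrimitiveModule("double", lambda d: {"x": d["x"] * 2})
increment = PrimitiveModule("increment", lambda d: {"x": d["x"] + 1})

# Same components, two different frameworks.
chained = CompositeModule("chained", [double, increment], pipeline)
fanned = CompositeModule("fanned", [double, increment], broadcast)
# A composite used as a component of another composite.
nested = CompositeModule("nested", [chained, double], pipeline)
```

Because callers rely only on run(), neither double nor increment
changes when the enclosing composite moves from one framework to the
other.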

MOP defines several layers on top of the basic modules
and frameworks. Skeletal systems are configurations of
modules that define a problem-solving structure, such as
a generic replanning system. The addition of a
domain-specific customization to a skeletal system results in a
full-fledged application. For a more complete discussion
of MOP see [Erman 86].

1.2.  AADS  —  An  Example  Application 

The remainder of this report uses examples from a
demonstration system we built called AADS, or the ABE
Anti-Submarine-Warfare Demonstration System
[Hollander 86]. We developed the system to
demonstrate ABE's applicability in the Navy Battle
Management arena, particularly for the CASES
(Capabilities Assessment) system [SPAWARSYSCOM 85].
While AADS addresses only a small subset of the CASES
problem in a very superficial way, it does illustrate some
important features of ABE.

The AADS system models an idealized
anti-submarine-warfare (ASW) campaign. In this model, an ASW
campaign consists of four stages: underwater mines, patrol
aircraft, independent submarines, and aircraft carrier
battle groups. Each stage operates more or less
independently, and the total effectiveness, in terms of percentage
of enemy submarines destroyed, is the product of each
stage's individual effectiveness.

We have built a model of the ASW campaign, which
takes as inputs force levels for the various stages (e.g.,
number of mines, duration of aircraft patrol, etc.), and
produces as outputs a measure of the overall effectiveness
and a relative cost measure. Separate from the ASW
model, we have implemented a simple hill-climbing
search routine (the Satisficer) that, given an initial
configuration of force levels and a utility function, searches
for a new set of force levels that optimizes that utility
function. The utility function consists of a number of
semi-independent preference measures, such as expressing
preferences for or against any specific stage or setting a
maximum cost target. Finally, we have built a number
of special-purpose graphical and tabular interfaces to the
system, which allow the user to interact in a
spreadsheet-like manner with the system.
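
A hill-climbing search of the kind described can be sketched in a few
lines of Python. This is an illustration only, not the Satisficer
itself: the per-stage effectiveness curve, the cost weight, the stage
names used as dictionary keys, and the unit step size are all invented
for the example. The only property taken from the text is that total
effectiveness is the product of the stage effectiveness values. Note
that hill climbing stops at a local optimum of the utility function,
which need not be the global one.

```python
from typing import Callable, Dict, Iterator

Levels = Dict[str, int]  # force level per stage


def neighbors(levels: Levels, step: int = 1) -> Iterator[Levels]:
    """Yield configurations that differ from `levels` in exactly one stage."""
    for stage in levels:
        for delta in (-step, step):
            new_level = max(0, levels[stage] + delta)
            if new_level != levels[stage]:
                candidate = dict(levels)
                candidate[stage] = new_level
                yield candidate


def hill_climb(initial: Levels, utility: Callable[[Levels], float],
               max_steps: int = 1000) -> Levels:
    """Move to the best neighbor until no neighbor improves the utility."""
    current = dict(initial)
    for _ in range(max_steps):
        best = max(neighbors(current), key=utility, default=current)
        if utility(best) <= utility(current):
            break
        current = best
    return current


def effectiveness(levels: Levels) -> float:
    """Toy model: product of the per-stage effectiveness values."""
    total = 1.0
    for level in levels.values():
        total *= level / (level + 5.0)  # hypothetical diminishing returns
    return total


def utility(levels: Levels) -> float:
    cost = sum(levels.values())
    return effectiveness(levels) - 0.001 * cost  # hypothetical cost weight


initial = {"mines": 10, "patrol_aircraft": 10,
           "submarines": 10, "carrier_groups": 10}
recommended = hill_climb(initial, utility)
```

Additional preference measures, such as a maximum cost target, could
be added as further terms of the utility function without changing
the search routine.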


Figure 1-1: Top-level view of ABE ASW Demonstration System (AADS)


Figure 1-1 shows a view of the main AADS modules,
composed using ABE's DF dataflow framework. The
rectangular boxes represent modules, while the ellipses
represent places that can hold ADTs. Figure 1-2 shows the
interface to the AADS ASW model, which consists of a
number of graphs that the user can manipulate, along
with various derived attributes such as cost and
effectiveness, and Figure 1-3 shows the force preferences
interface and the results table. The force preference interface
consists of a number of gauges that control the various
aspects of the utility function. The results table displays
one row for each run of the Satisficer, showing the initial
force levels, the utility function coefficients, and the
recommended changes to the initial force levels.

AADS is an initial version of a decision-support tool for
planning ASW campaigns. It allows a user to select an
initial force preference configuration and determine its cost
and effectiveness. It also allows a user to search for a
“best” configuration, where the user can define his own
version of “best”. We have constructed AADS as a
testbed and demonstration for ABE. In particular, we
have demonstrated the ability to design a system using
multiple, independent frameworks for combining
modules, to use both hierarchical and non-hierarchical
communication among the modules, and to interface to
modules implemented in external languages such as KEE
and Knowledge Craft.

2.  Intelligent  Systems  Engineering 

This section describes our view of the Intelligent
Systems Engineering process. We first define intelligent
systems and intelligent systems development
environments, and then describe our system development
methodology, concluding with a discussion of ABE
features that support that methodology.

2.1.  Characteristics  of  Intelligent  Systems 

Intelligent systems comprise both knowledge-based and
conventional software components, and function in an
integrated fashion with their surrounding environment. In
contrast to conventional expert or knowledge systems,
intelligent systems do not have a single “knowledge”
component with which the rest of the system interacts.
Instead, intelligent systems contain many modular
capabilities, each of which may contain knowledge-based
components.

An intelligent system development environment (ISDE)
is a software environment that supports the construction
of intelligent systems. ISDEs support the process of
building intelligent systems by providing

•  an architecture and methodology for describing
and building intelligent systems;

•  tools for describing system requirements and
designs from multiple perspectives;

•  interactive programming environment(s),
including interpreters, debuggers, and browsers;

•  support for the evolution of systems, especially
including performance tuning and compilation; and

•  tools for managing the complexity of the
software development process, including tools
for abstracting module and system descriptions,
automatic cataloging and retrieval of
modules, and source and configuration control
across multiple machines and developers.

The following paragraphs describe the important
characteristics of intelligent systems and present some
detailed requirements for an ISDE. We also indicate some
of the features that ABE provides that satisfy these
requirements.

2.1.1.  Scale 

Intelligent systems address problems of a very large
scale. Large-scale can refer to many different measures
of a system:

•  the size of the knowledge or databases the
system uses;

•  the number and variety of different functions
the system must perform;

•  the number of different components that
comprise the system; and

•  the complexity of the interactions among
components.

ISDEs require tools and methodologies for dealing with
the increased complexities of these massive systems. ABE
supports hierarchical structuring of systems with
associated graphical inspection tools which allow one to
manage system complexity. In Phase 2 we will provide
additional facilities to address the problem of scale,
including integrating a commercial relational DBMS with
ABE and developing techniques for arbitrarily compiling
away unnecessary structure to support high-performance
applications. For instance, we will support the
compilation of hierarchically-structured systems described above.

2.1.2.  Reusability

The large-scale systems described above cannot in
general be built from scratch. System developers will
need to reuse existing code, algorithms, and architectures
in order to capture poorly-understood techniques,
increase the efficiency of the system development process,
and produce well-tested and reliable systems. ISDEs must
provide a standard architecture that both supports the
reuse of existing code and provides for newly developed
code to be reused in later applications. This same
architecture can support the evolutionary development of
an intelligent system, which can be viewed as the reuse of
the subcomponents through different phases of the
system development lifecycle.

By  supporting  the  definition  of  standard  interfaces,  ABE 
allows  a  module  builder  to  create  modules  that  can  be 
cataloged  and  reused  in  new  applications.  Over  time,  the 
catalog  will  grow  to  contain  modules  of  proven  utility, 
reliability,  and  performance. 

2.1.3.  Intelligibility 

By  intelligibility  we  refer  to  two  distinct  capabilities. 

•  In  classical  knowledge  engineering  terms,  a 
system  should  have  the  capability  to  make  its 
actions  and  reasoning  understandable  to  the 
current  task  and  user. 

•  From a software engineering viewpoint, a
system should have a clear and understandable
system definition, to support maintenance and
evolution.

ABE provides the structure to incorporate the best ideas
in system explanation, such as those coming out of the
Strategic Computing Initiative Knowledge-Based Systems
program [Chandrasekaran 86]. We also have
concentrated on making the system definitions open and
intelligible through the use of special-purpose graphics,
browsers, and inspectors.

2.1.4.  Integration 

Finally, intelligent systems should support many
different kinds of integration: integrating knowledge-based
and conventional components, integrating components
written in different languages, and integrating systems
across a distributed heterogeneous computing
environment. ISDEs must provide support for specifying
standard interfaces to software components.

We have already demonstrated the integration of
conventional components with knowledge-based components,
and the integration of components written in different
languages. In Phase 2 we will build a distributed
processing infrastructure on top of the MACH distributed
operating system [Rashid 86], and upgrade many ABE
frameworks to work in a distributed environment.

2.2.  System  Design  Methodologies 

ABE takes a hybrid view of the Intelligent Systems
Engineering process. We see system-building happening
in both a top-down and bottom-up fashion
simultaneously, with explicit support for evolving a system
design as system requirements or problem understanding
change. The top-down component consists of analyzing a
system's function(s), and allocating those functions to
various functional modules. The design at this stage also
requires some notion of control and communication
among these modules. ABE frameworks embody
particular control and communication design decisions, and
the system designer selects a framework appropriate to
the task and modules at hand. For instance, if the
system designer knows that the functional modules will
want to pass control and communicate data in a simple
dataflow manner, the designer would select the DF
(dataflow) framework to describe the modules'
interactions.

Given a decomposition of the system into a number of
independent modules, the system designer can apply the
same decomposition process recursively to the component
modules. Control and communication design decisions
made at higher levels have no effects on decisions at
lower levels. This property allows the system designer to
select the most appropriate ABE framework to implement
any module. This recursive decomposition continues
until the designer reaches primitive modules: either
existing modules from the ABE catalog, or modules that will
be built from scratch in a programming language such as
LISP, C, or KEE.

We have intentionally not committed to a specific
decomposition methodology, such as SADT. We believe
that over time each framework will acquire its own
design methodology or, conversely, frameworks will be
built to support standard methodologies. For instance,
we are investigating the use of the blackboard
metaphor [Erman 80] as a system design technique, with
the BBD framework providing an environment for
immediately testing new designs.

In parallel with the top-down decomposition process,
ABE also supports a bottom-up synthesis process.
Starting with existing modules in the ABE catalog, system
builders can compose new modules with higher-level
functionalities. These new modules can themselves be
placed in the catalog, for reuse in still higher-order
modules.

ABE supports a prototyping approach to module
synthesis. System designers can rapidly mock up systems in
an executable design framework, which provides for
immediate feedback to the designer. ABE provides the
following features which support this prototyping activity:

•  Interactive interpreters, editors, and debuggers
for each of the system design frameworks.

•  Interactive catalog browsers, for searching a
library of existing modules to find one
appropriate to a specific application.

•  Support for module stubs and delayed binding
of modules and datatypes, which allows a
designer to focus on one level at a time.

•  Ability to change frameworks as the system
designer comes to understand the module
interactions better, without having to change
the modules that comprise the system.

•  Ability to save partial modules in the catalog
where they can be retrieved later for
additional refinement, or used as components in
other systems.

The bottom-up approach combined with the top-down
approach described above together permit system
designers and builders to converge rapidly on a working
prototype system. ABE provides or will provide support
for the evolution of that prototype to a full operational
system:

•  Many of the features listed above for
prototyping, such as changing frameworks.

•  Provisions for adding non-hierarchical
communication between modules.

•  Advanced compilation techniques to collapse
embedded systems and remove levels of
interpretation.

3.  Modules 

This section briefly describes how a module builder
implements primitive and composite modules in ABE.
We first look at primitive modules, focusing on the
specification of the I/O behavior. We then describe how
to import foreign capabilities as modules, and list some of
the problems to expect in importing foreign code and
methods to minimize their impact.

3.1.  Primitive  Modules 

ABE modules can be either primitive or composite.
Primitive modules have no ABE-defined internal
structure. They form the lowest level building blocks of ABE
systems. The behavior of a primitive module is supplied
by a piece of code written in one of the ABE-supported
programming languages, such as CORAL, Common LISP,
or KEE. When the programmer supplies that code with
the rest of the module definition we refer to that module
as a black-box module. When the bulk of that code is
supplied by an existing piece of foreign code, we refer to
the module as an importer module. The distinction
exists only as an annotation to users to indicate that a
module imports some external code; ABE does not
distinguish these two kinds of primitive modules.

A primitive module specification has four main parts: a
definition of the I/O behavior of the module, a definition
of the function the module computes, definitions of other
operations, and documentation and other annotations.
The I/O specification consists of descriptions of a number
of ports. Modules receive and transmit data through
ports. Ports can be designated as input, output, or
bidirectional, and can handle data synchronously or
asynchronously with respect to the execution of the
module. In addition, the module builder can attach type
information to a port, indicating the allowable datatypes
that can flow through that port.
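
The port attributes listed above (direction, synchrony, and an
optional datatype restriction) can be pictured with the following
Python sketch. It is not ABE's port mechanism: the Port and Direction
classes, the put method, and the AswPreferences class are hypothetical
names chosen for illustration, loosely modeled on the force preference
interface module of AADS.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Any, Optional


class Direction(Enum):
    INPUT = "input"
    OUTPUT = "output"
    BIDIRECTIONAL = "bidirectional"


@dataclass
class Port:
    name: str
    direction: Direction
    synchronous: bool = True          # handled in step with module execution
    datatype: Optional[type] = None   # None means any datatype may flow
    value: Any = None

    def put(self, value: Any) -> None:
        """Store a value, enforcing the declared datatype if there is one."""
        if self.datatype is not None and not isinstance(value, self.datatype):
            raise TypeError(
                f"port {self.name!r} expects {self.datatype.__name__}")
        self.value = value


@dataclass
class AswPreferences:  # hypothetical stand-in for an ADT
    max_cost: float


go = Port("go", Direction.INPUT)
preferences = Port("preferences", Direction.OUTPUT, datatype=AswPreferences)

go.put("start")                                  # untyped port: anything goes
preferences.put(AswPreferences(max_cost=100.0))  # matches the declared type
try:
    preferences.put("not an ADT")                # wrong type is refused
    rejected = False
except TypeError:
    rejected = True
```

Checking the type at the port, and not inside the module body, keeps
the check independent of the language in which the body is written.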

Figure 3-1 shows the definition of the AADS force
preference interface module. The :IN and :OUT
arguments define the input and output arguments of the
module, respectively. Note that the output argument
preferences has an explicit type declaration.

The function definition of a primitive module, also
known as its body, consists of a declaration of a set of
local state variables and a procedure definition. ABE
supports both dynamic local variables and persistent local
variables, i.e., variables that retain their state between
invocations. The procedure definition determines the
behavior of the module. It can be written in any
ABE-supported language, e.g., Common LISP. It can
freely refer to any local variables, and can also access and
store values in ports.

Referring back to Figure 3-1, the :IVARS argument
defines a single persistent local variable Frame, which
points to the preference interface window. The :EXECUTE
argument specifies the procedure to invoke when this
module is called. The call to create-adt creates an
abstract datatype, using slot values from the user-defined
(k:preferences k:kee-asw) unit to serve as initial
values for the ADT's slots. Finally, the RETURN-OUTPUT
function returns this new ADT as the value of the
preferences output. See Section 4 for more details on
ADTs.

Modules  can  have  more  than  one  primary  function. 
For  example,  a  module  may  perform  widely  different 
operations  depending  on  which  input  ports  have  data 
when the module is invoked. This case occurs often
when a single module instance occurs in more than one
framework  simultaneously.  We  are  investigating  various 
alternatives  for  describing  the  different  functions  a 
module  may  perform. 

A module must support other operations besides its
primary function(s). ABE includes provisions for
initializing the persistent state of a module, and for resetting
part of that state. We call the initialization operation
customization. Our intuitive definition of module
customization is the specialization of a generic module,
especially for a specific application domain. The module
builder must supply a procedure for customizing a
module for each specific domain the module will operate
in. The customization operations may consist of
preloading a database, defining domain-specific terms and
operators, configuring special-purpose user interfaces, or
other arbitrary operations.

ABE  also  defines  two  reset  operations.  A  “soft”  reset 
clears  any  execution-specific  data  in  a  module,  to  prepare 
it  for  operating  on  a  new  set  of  input  data.  A  “hard” 
reset  performs  all  the  operations  of  a  soft  reset,  and  also 
clears  any  customizations  in  effect.  If  appropriate,  a 
module  builder  may  supply  procedures  to  implement 
these  operations. 
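The relationship between customization and the two reset operations can be sketched as follows. This is a Python illustration under our own assumptions, not ABE code; the class and method names are hypothetical.

```python
class ResettableModule:
    """Toy module with customization, soft reset, and hard reset."""

    def __init__(self):
        self.customizations = {}  # domain-specific state set by customize()
        self.run_data = []        # execution-specific data

    def customize(self, domain, terms):
        # Specialize the generic module for one application domain.
        self.customizations[domain] = list(terms)

    def execute(self, item):
        self.run_data.append(item)

    def soft_reset(self):
        # Clear execution-specific data only.
        self.run_data.clear()

    def hard_reset(self):
        # Everything a soft reset does, plus clearing customizations.
        self.soft_reset()
        self.customizations.clear()


m = ResettableModule()
m.customize("asw", ["mines", "aircraft"])
m.execute("scenario-1")
m.soft_reset()
after_soft = (list(m.run_data), dict(m.customizations))
m.execute("scenario-2")
m.hard_reset()
after_hard = (list(m.run_data), dict(m.customizations))
```

The soft reset leaves the module ready for new input data in the same domain; the hard reset returns it to its generic, uncustomized state.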

Figure 3-2 shows the definition of the hard reset code
for the preference interface module. When invoked, this
code creates a new interface (KEE desktop and windows)
by calling the user-defined function
LOAD-KEE-INTERFACE.


(defmodule ASW-PREFERENCE-INTERFACE-KEE
  (:IN go)
  (:OUT (preferences (:type asw-preferences)))
  (:IVARS
    (Frame
      :settable (:default-init nil)
      (:documentation "The KEE desktop that holds the ActiveImages")))
  (:EXECUTE
    (declare (ignore go))
    "Return the preferences specified by the user's menu settings."
    (RETURN-OUTPUT
      :preferences
      (CREATE-ADT 'asw-preferences-kee
        :kee-unit (k:unit '(k::preferences k::kee-asw)))))
  (:DESCRIPTION "The ABE/ASW Preference Interface - KEE version.")
  (:DOCUMENTATION
    "The preference interface implemented by a set of KEE
     ActiveImages.  The user can freely interact and modify the values
     of the preference gauges.  When called, this module looks at the
     values in the gauges and returns an abstract datatype that contains
     all those values."))

Figure  3-1:  Definition  of  the  AADS  Preference  Interface  module 


(defresponse (asw-preference-interface-kee :RESET)
  "Create a new KEE desktop and actuators."
  (setq frame (load-kee-interface))
  (send frame :bury))

Figure 3-2: Reset Code for the AADS Preference Interface module

Module builders can also specify annotations and
documentation for a primitive module. Current annotations
include:

•  A  one-line  documentation  string,  which  gives 
a  brief  description  of  the  function  of  the 
module. 

•  A  multi-line  documentation  string,  which 
describes  the  operation  of  the  module  in 
detail. 

•  A graphical name and associated formatting
information, which determine how ABE formats
the name of the module on the computer
display.

•  A  graphical  icon,  which  provides  a  quick 
visual  indication  as  to  the  identity,  function, 
or  operation  of  the  module. 

The  DESCRIPTION  and  DOCUMENTATION  arguments  in 
Figure  3-1  show  the  short  and  long  documentation  for 
the  force  preference  module. 

3.2.  Importer  Modules 

As described above, an importer module is a type of
primitive module in which a piece of foreign code
provides some functionality that, together with
user-written code, defines the behavior of the module. We
assume that the module builder is unlikely or unable to
make modifications to the foreign code, e.g., he may not
have access to the source code. However, we do assume
the module builder has a set of documented entry points
into the code.

The  basic  steps  needed  to  import  foreign  code  into  ABE 
are: 

1.  Identify  the  functions  required  of  the  foreign 
code,  and  the  entry  points  that  provide  that 
functionality. 

2.  Determine the input and output data requirements
for those entry points, i.e., what arguments
does the foreign code expect, and
what values does it return. This may require
the definition of new ADTs to interface to
internal data structures in the foreign code.

3.  Define an importer module to act as a
transducer for the entry points by implementing
foreign code calling conventions, converting
data, and specializing the module to run
in the ABE environment.


The  module  builder  must  also  consider  the  following 
properties  of  the  foreign  code  and  the  effect  they  will 
have  on  any  ABE  modules  that  include  the  imported 
code: 

•  reentrancy  of  the  foreign  code; 

•  user  interface  to  the  foreign  code;  and 

•  unintended interactions with other modules,
or with the rest of the native computing
environment.

The result of the importation process is a black-box
module which can then be used like any other ABE
module. The rest of this section describes the
importation process in more detail.

3.2.1.  Identifying  Entry  Points 

As stated above, each module generally performs a
single function or a set of related functions which vary
according to the module's input data. The module
builder must first identify the functions expected of the
foreign code. Each function should specify an operation
to be performed on a set of input data, resulting in a set
of output data. In some cases, the function may not
consume any inputs or may not produce any outputs. The
specification should also state whether the data will be
synchronous or asynchronous with respect to the
invocation of and return from the foreign code.

The module builder must next determine the entry
points that implement each desired function. Ideally,
there exists a one-to-one mapping from functions to entry
points. In cases in which this condition does not hold,
the module builder can apply a number of techniques.
The first calls for the module builder to look for a
number of entry points that each provide a piece of the
desired functionality, and then combine them into a
single importer module. Note that the aggregation of the
entries can provide extra functionality beyond that
desired; the importer module can ignore the extraneous
functionality.

If no combination of existing entries supplies the
desired functionality, the module builder can try to
create a new entry point. This generally requires access
to the sources for the foreign code and an understanding
of the internal structure and behavior of the foreign code.

A third alternative applies if an entry point supplies a
critical subfunction. In this case, the module builder can
build the missing functionality into the importer module,
or can define a composite module that includes the
foreign code as one module, along with other ABE
modules to supply the missing functionality. If this step
fails, the module builder must reexamine the intended
use of the foreign module.
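The first technique, combining several entry points into one importer and ignoring extraneous results, might look like the following sketch. It is in Python rather than an ABE-supported language; foreign_tokenize and foreign_count are invented stand-ins for documented entry points of some foreign code.

```python
# Stand-ins for two documented entry points of foreign code (hypothetical).
def foreign_tokenize(text):
    return text.split()


def foreign_count(tokens):
    # Returns more than the importer needs: counts plus the longest token.
    counts = {}
    for tok in tokens:
        counts[tok] = counts.get(tok, 0) + 1
    longest = max(tokens, key=len) if tokens else None
    return counts, longest


def word_frequency_importer(text):
    """Importer body: sequences both entry points to obtain one function."""
    tokens = foreign_tokenize(text)
    counts, _longest = foreign_count(tokens)  # extraneous result is ignored
    return counts


result = word_frequency_importer("mines subs mines")
```

Neither entry point alone provides the desired function; the importer body supplies the sequencing and discards what it does not need.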

3.2.2.  Input/Output  Requirements 

The module builder must now identify the data
requirements for the new module. This involves determining the
content of input and output data, as well as the timing
requirements for that data.

ABE modules communicate with each other through
ADTs. ABE provides mechanisms for defining both the
logical and physical structure of ADTs. Modules
uniformly access and modify ADTs by sending them
messages. See Section 4 for details.

In order to import foreign code, the module builder
must identify or define ADTs that support the I/O
requirements of the importer module. Usually, external
considerations dictate the logical structure of these ADTs,
e.g., the importer module has to interface with other
modules with pre-established I/O requirements.

Another point to consider in defining ADTs is their
physical structure. The module builder can specify that
a data structure in the foreign code provides the physical
definition of an ADT. This requires knowledge of and
access to the foreign data structures, but can simplify the
job of the importer module in doing data conversions,
increasing the program's efficiency at the same time.
Section 4 describes this process in more detail.

Figure 3-3 shows the definition of the AADS force
preferences abstract datatype. The :IMPLEMENTATION
argument specifies the name of an ABE-supplied ADT
storage method; this ADT will interface with KEE and
access values inside KEE Units automatically. This
interface frees the force preference module shown in Figure
3-1 from worrying about data formats inside KEE.

In addition to the structure of the data, the module
builder must determine the timing characteristics of the
data. Generally, a module receives all of its input data
when invoked and returns all of its output data when it
returns. We refer to this method of passing data as
synchronous. ABE also supports asynchronous data
exchange, where an already executing module can send
messages to other modules to send or receive data. The
AADS Satisficer module interacts in this manner with the
ASW Model module, using the TX transaction framework
to specify their interactions. Figure 3-4 shows this
TX-defined composite module, with the ASW Model module
acting as a server for the AADS Satisficer module.
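The two timing styles can be contrasted with a small sketch. It is in Python, not an ABE-supported language, and it does not reproduce the TX framework: ModelServer, pick_best, and satisfice are hypothetical names, and the scoring rule is invented purely for illustration.

```python
class ModelServer:
    """Plays the server role; evaluate() is a made-up scoring rule."""

    def __init__(self):
        self.requests = 0

    def evaluate(self, plan):
        self.requests += 1
        return sum(plan)


def pick_best(scores):
    """Synchronous style: all input arrives at invocation,
    all output is delivered on return."""
    return max(scores)


def satisfice(plans, server, threshold):
    """Asynchronous style: while executing, the module messages the
    server for data and stops at the first plan that is good enough."""
    for plan in plans:
        if server.evaluate(plan) >= threshold:
            return plan
    return None


server = ModelServer()
chosen = satisfice([[1, 1], [2, 3], [9, 9]], server, threshold=5)
best = pick_best([3, 7, 5])
```

In the second style the amount of data exchanged depends on how the computation unfolds, which is why it cannot all be supplied at invocation time.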


(defadt ASW-PREFERENCES-KEE
  "Define an ADT that can access the values
   of the preference settings."
  (:IMPLEMENTATION :kee-copy)
  (:SLOTS
    (k::Cost-Or-Effectiveness
      (:to-read :driving-goal))
    (k::Initial-Enemy-Subs
      (:to-read :number-subs))
    (k::Acceptable-Surviving-Subs
      (:to-read :surviving-subs))
    (k::Cost-Goal
      (:to-read :raw-cost-goal))
    (k::Cost-Effectiveness)
    (k::Early-Kills)
    (k::Balanced-Changes)
    (k::Minimal-Changes)
    (k::Mines)
    (k::Aircraft)
    (k::Submarines)
    (k::Battle-Groups)))


Figure 3-3: Definition of Preferences abstract datatype

Figure 3-4: Transaction connection between AADS
Satisficer and Model modules

3.3. Defining the Importer Module

The module builder can now define the importer
module. As described earlier, an importer module is just
a special case of a primitive module. The module body
of an importer aggregates and sequences foreign code
entry points, implements any necessary data
transformations, and passes data into and out of the foreign
code.

The module builder must also pay attention to several
other subtle issues which can complicate the importation
process. These generally involve unforeseen interactions
between the imported code and the rest of the ABE
environment.

3.3.1.  Reentrancy 

The ABE model specifies that a number of instances of
a given module can exist simultaneously. Standard ABE
frameworks produce reentrant composite modules, so
multiple instances do not present any problem. However,
ABE cannot guarantee that imported foreign code will be
reentrant.

Given  a  piece  of  non-reentrant  foreign  code,  the  module 
builder  has  three  alternatives:  modify  the  foreign  code  to 
make  it  reentrant,  patch  around  the  foreign  code  to  make 
it  appear  reentrant  from  the  outside,  or  ensure  that  no 
more  than  one  instance  of  the  module  is  active  at  one 
time. 

The first alternative requires the module builder to have
access to sources and understand the foreign code enough
to modify it correctly. The second method requires the
module builder to identify the global objects used by the
foreign code and create a copy of them for each instance
of the importer module, storing them in the importer
module's ivars. The third alternative requires
cooperation among the modules that will interact with the
importer module, as ABE currently provides no such
support.

3.3.2.  User  Interface

ABE provides a standard system developer user
interface for displaying the operation of composite modules.
It does not provide any standard facilities for displaying
the operation of primitive modules. When the foreign
code needs to communicate with the user, it can use its
own user interface.

Problems can arise when the foreign code's user
interface makes assumptions about the processing
environment in which it runs. For instance, if the foreign code
assumes it has free access to write to the screen or read
from the keyboard, the native window system can easily
become confused.

At other times, the module builder will wish to suppress
the foreign code's user interface entirely. Rebinding
output streams that the foreign code uses to a null stream
provides a quick method for ignoring all output from the
foreign code.
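The same idea can be shown in Python, where the standard library's contextlib.redirect_stdout temporarily rebinds the standard output stream. This is an analogy rather than ABE code; noisy_foreign_call and call_silently are hypothetical names.

```python
import contextlib
import os


def noisy_foreign_call(x):
    # Stand-in for foreign code that writes to the screen unprompted.
    print("foreign code: working on", x)
    return x * 2


def call_silently(func, *args):
    """Run func with standard output rebound to the null device."""
    with open(os.devnull, "w") as null_stream:
        with contextlib.redirect_stdout(null_stream):
            return func(*args)


value = call_silently(noisy_foreign_call, 21)
```

The rebinding is undone automatically when the call returns, so output from other modules is unaffected. Output that the foreign code writes through other channels would need separate handling.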

We  plan  to  adopt  a  standard  for  user  interfaces  in  the 
near  future.  This  standard  will  include  prescriptions  for 
user  interfaces  for  imported  code. 

3.3.3.  Virtual  Memory  Considerations 

The current implementation of ABE runs on a
Symbolics 3600 series computer. All software, including
the kernel ABE system, all ABE modules, and all foreign
code, must run in the same virtual memory. The
Common LISP package system provides a means for
multiple programs to share the same virtual memory without
having name conflicts. However, the package system
does not prevent conflicts of package names. These often
occur for package nicknames, which can be one- or
two-character strings. If the module builder wishes to load
foreign code that has a package name conflict with an
existing package, the existing package must be renamed to
avoid conflict with the new package. The code in the
renamed package will still continue to function properly,
but editing it can cause problems.

Care must also be taken to watch for other conflicts
among globally shared resources. Examples include
readtable definitions and read macros, Select key
commands, command processor commands, editor key
bindings, and other environment customizations.

4.  Abstract  Datatypes 

The MOP model calls for modules to communicate with
each other by exchanging structured data. This data
takes the form of abstract datatypes (ADTs). Modules
take ADTs as input and return ADTs as outputs. In
addition, ABE uses ADTs to act as interfaces to foreign
languages and modules.

ABE provides the programmer flexibility in defining the
physical storage of ADTs. The ABE ADT facility provides
mechanisms for defining new storage methods for ADTs,
particularly for accessing data structures defined by
foreign code or languages. By default, ABE implements
ADTs as CORAL instances. An ABE user should not
require other implementation methods unless he has to
integrate with foreign code.

4.1.  Logical  Definition 

An ADT is an object which logically comprises a number
of named slots. Each slot contains a single value or a
number of values. Values are LISP objects, including
other ADTs. An ADT may specify a type for its slot
values. An ADT generally supports read and write
operations on its slots; the programmer can selectively declare
slots read-only.

ABE also supports a special type of slot called a link.
Links are like slots, except that ABE constrains their
value to be another ADT instance or list of instances.
Furthermore, each instance pointed to must have a link
pointing back to the original ADT. We refer to that
second link as a back pointer. Links automatically
maintain the interconnections between complex structured
objects.
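How a link can keep its back pointer consistent automatically is illustrated by the following sketch. It is written in Python, not an ABE-supported language, and the class Node with its add_link and remove_link methods is hypothetical.

```python
class Node:
    """Toy ADT with one link slot; names are illustrative only."""

    def __init__(self, name):
        self.name = name
        self.links = []  # link slot: holds other Node instances

    def add_link(self, other):
        if other not in self.links:
            self.links.append(other)
            other.add_link(self)  # maintain the back pointer

    def remove_link(self, other):
        if other in self.links:
            self.links.remove(other)
            other.remove_link(self)  # remove the back pointer too


group = Node("battle-group")
ship = Node("ship")
group.add_link(ship)
linked_both_ways = ship in group.links and group in ship.links
ship.remove_link(group)
unlinked_both_ways = not group.links and not ship.links
```

Because each update propagates to the other end, code that manipulates one object never has to repair the structure of its neighbors by hand.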

ADTs  are  persistent,  mutable  objects.  Modules  access 
and  modify  the  values  of  an  ADT’s  slots  by  sending  it 
messages.  In  response  to  these  messages,  an  ADT  can 
change  state,  send  other  messages  to  other  related  ADTs, 
or  return  a  value  based  on  its  slot  values. 

4.2.  Physical  Definition 

ABE  provides  the  programmer  flexibility  in  defining  the 
physical  storage  of  ADTs.  The  ADT  facility  provides 
mechanisms  for  defining  new  storage  methods  for  ADTs, 
particularly  for  accessing  data  structures  defined  by 
foreign  code  or  languages,  e.g.,  KEE  Units.  It  also  allows 
the  programmer  to  specify  access  schemes  for  reading 
data  from  the  foreign  data  structure. 

By default, ABE allocates storage for an ADT's slots
within the ADT instance. In this case, we say that the
ADT instance caches its slot values. However, a
programmer can specify alternate storage for slot values, such as
a hashtable. In this case, the ADT forwards slot access
requests to the appropriate data structure. The ADT
instance then contains a pointer to the data structure that
holds the slot values. Finally, a hybrid ADT maintains
both local and remote storage.

The ADT facility provides multiple access schemes to
support the integration of foreign languages and modules,
i.e., systems implemented outside of the ABE
environment. The facility lets a programmer create standard
interfaces to data structures in the foreign module, which
other ABE modules can then use. Forwarding ADTs
provide a "window" into foreign code; if slot values
change from inside that code (i.e., not through ADT
messages), subsequent ADT accesses will return the
changed values. Forwarding ADTs also provide means for
creating new foreign data structures, containing a given
set of initial slot values.

Conversely, a caching ADT provides a "snapshot" of a
foreign data structure. The ADT copies all slot values out
of the foreign structure into the ADT instance when the
instance is created or when a slot is accessed for the first
time; subsequent changes to the underlying data
structure do not affect the ADT instance. The
ASW-PREFERENCES-KEE ADT defined in Figure 3-3 is a
caching ADT that reads initial slot values out of a KEE Unit
when an ADT instance is created.
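The "window" and "snapshot" behaviors can be contrasted in a short sketch. It uses Python rather than an ABE-supported language, with a dictionary standing in for a foreign data structure such as a KEE Unit; ForwardingADT and CachingADT are hypothetical names.

```python
class ForwardingADT:
    """Window: every access goes to the foreign structure."""

    def __init__(self, foreign):
        self._foreign = foreign  # pointer to the foreign structure

    def get(self, slot):
        return self._foreign[slot]

    def put(self, slot, value):
        self._foreign[slot] = value


class CachingADT:
    """Snapshot: slot values are copied when the instance is created."""

    def __init__(self, foreign):
        self._cache = dict(foreign)

    def get(self, slot):
        return self._cache[slot]

    def put(self, slot, value):
        self._cache[slot] = value


unit = {"cost-goal": 100}
window = ForwardingADT(unit)
snapshot = CachingADT(unit)
unit["cost-goal"] = 250  # a change made inside the "foreign code"
seen = (window.get("cost-goal"), snapshot.get("cost-goal"))
```

The forwarding instance reports the changed value, while the caching instance still holds the value it copied at creation time.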

The hybrid ADT access mechanism provides a number
of different capabilities. It can support selective caching
or forwarding for particular slots, or implement
"read-through copy-on-write" access to the foreign data
structure. Certain applications may require one or more types
of slot access methods.
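One possible reading of "read-through copy-on-write" access is sketched below: reads fall through to the foreign structure until a slot is written, after which the local copy is used. The sketch is in Python under that assumption, and CopyOnWriteADT is a hypothetical name.

```python
class CopyOnWriteADT:
    """Hybrid storage: remote reads, local writes."""

    def __init__(self, foreign):
        self._foreign = foreign  # remote storage (never modified here)
        self._local = {}         # local storage for written slots

    def get(self, slot):
        # Read through to the foreign structure unless written locally.
        if slot in self._local:
            return self._local[slot]
        return self._foreign[slot]

    def put(self, slot, value):
        # Writes stay local; the foreign structure is left untouched.
        self._local[slot] = value


unit = {"mines": 4, "aircraft": 12}
adt = CopyOnWriteADT(unit)
adt.put("mines", 9)
unit["aircraft"] = 20  # a change made inside the "foreign code"
view = (adt.get("mines"), adt.get("aircraft"))
```

Unwritten slots behave like a forwarding ADT and written slots like a caching ADT, which is what makes the scheme a hybrid.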

Figure 4-1 illustrates the structure of an ADT, which
contains a pointer to an external data structure and/or
local storage. The figure shows the use of KEE Units and
NIKL* concepts as storage mechanisms for ADTs.

4.2.1.  Defining  New  Storage  Methods 

In order to create a new ADT storage method, the
programmer needs to inform the ADT facility how to
access an external data structure. The ADT facility
supports a simple model of data structure interfaces,
consisting of a number of standard functional capabilities
such as adding a new value to a given slot. The
programmer must supply a function for each capability.
With these functions, the ADT facility compiles a new
storage method automatically.

Each ADT storage method must provide an
implementation for each of these capabilities:

•  Add a new value to a set of values.

•  Delete a given value from a set of values.

•  Replace all values with a single new value.

•  Replace all values with a set of new values.

•  Return a single slot value.

•  Return all slot values as a list.

* NIKL is a knowledge representation language developed by
USC/ISI.

The ADT facility uses these supplied capabilities to build
the responses to access and update an ADT instance's
slots and links. In addition, an ADT's physical definition
must specify the type of remote slot accesses, e.g.,
copy-on-write, etc.
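To make the six capabilities concrete, the sketch below bundles one function per capability into a storage method for a dictionary-backed store. It is a Python illustration, not the ADT facility's actual interface; StorageMethod, its field names, and dict_storage are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class StorageMethod:
    """One supplied function per required capability."""
    add_value: Callable
    delete_value: Callable
    replace_with_one: Callable
    replace_with_many: Callable
    get_one: Callable
    get_all: Callable


def _values(store, slot):
    # Each slot holds a list of values; create it on first use.
    return store.setdefault(slot, [])


dict_storage = StorageMethod(
    add_value=lambda store, slot, v: _values(store, slot).append(v),
    delete_value=lambda store, slot, v: _values(store, slot).remove(v),
    replace_with_one=lambda store, slot, v: store.__setitem__(slot, [v]),
    replace_with_many=lambda store, slot, vs: store.__setitem__(slot, list(vs)),
    get_one=lambda store, slot: _values(store, slot)[0],
    get_all=lambda store, slot: list(_values(store, slot)),
)

store = {}
dict_storage.add_value(store, "subs", 3)
dict_storage.add_value(store, "subs", 5)
dict_storage.delete_value(store, "subs", 3)
after_delete = dict_storage.get_all(store, "subs")
dict_storage.replace_with_many(store, "subs", [1, 2])
first = dict_storage.get_one(store, "subs")
```

A storage method for a different foreign structure would supply six different functions with the same roles, leaving code that uses the ADT unchanged.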

Figure 4-2 shows the definition of one of the interface
functions for the KEE ADT interface. This function
implements the "add a value to a slot" capability by
calling the corresponding KEE function ADD.VALUE. We
define the capability as a macro for performance
considerations.


(defmacro ADD-KEE-ADT (adt slot new)
  "Add a new value to a KEE ADT slot."
  `(kee::add.value (Kee-Unit ,adt) ,slot ,new))

Figure  4-2:  ADT  Function  to  add  a  slot  value  for 
KEE  Units 


5.  Summary 

Intelligent systems are characterized by their need to
solve large-scale problems, to support the reuse of
software components, to have intelligible definitions and
behavior, and to integrate with their surrounding
environment. ABE is a new generation software system
that supports the construction of intelligent systems. It
provides a general-purpose software architecture for
building intelligent systems based on its Module-Oriented
Programming methodology.

Two of the basic concepts of Module-Oriented
Programming are modules and abstract datatypes.
Modules provide the basic building blocks used to
construct intelligent systems. Primitive modules allow a
module  builder  to  define  a  new  module  using  an 
ABE-supported  language  such  as  Common  LISP  or  KKI.'l. 
Importer modules are a special case of primitive modules
that allow a module builder to import a piece of foreign
code and make it look like an ABE module.

Abstract datatypes permit a module or system builder
to describe his data at a high level without worrying
about the implementation. Conversely, they also allow a
user to create interfaces to foreign languages and code.

Abstract

This paper discusses DARPA-sponsored research in
the Experimental Knowledge Systems Laboratory at
the University of Massachusetts. The work focuses
on the control of reasoning under uncertainty. This
paper includes reports on the MU architecture, the
role of task-level architectures for knowledge
acquisition, a task-level approach to acquiring local
combining functions, and a language for describing the
kind of short-term contingency plans that seem
appropriate for reasoning under uncertainty. The paper
also describes evaluations of the GRANT system,
an approach to information retrieval based on partial
matching.

I.  Introduction 

This is the annual report for 1986-1987 on our
DARPA-sponsored research. Our work over the last year has been
concerned with the control of problem solving under
uncertainty. Sources of uncertainty (noisy data, incomplete
knowledge, and imprecision about the effects of actions)
render standard planning and problem-solving approaches
useless. These approaches depend on complete knowledge
of the states of the world and complete prior knowledge
of the effects of actions, neither of which can be
guaranteed in uncertain environments. But problem solvers must
act, nonetheless. Our research addresses the issue of how
problem-solvers' control strategies (their mechanisms for
focus of attention, control of inference, and control of
actions) can affect the efficiency with which they solve
problems, where efficiency measures roughly the tradeoff
between costs and certainty.

We have been building an architecture called MU for
problem-solving under uncertainty. MU is a generalization
of our earlier work on managing uncertainty in medicine,
described in our previous annual report. Section 2 of this
paper describes MU in some detail. MU is meant to
facilitate the transfer of strategic and tactical expertise, thus
much of our effort has gone into building tools for this
purpose. Section 3 describes the intellectual basis for this work;
it is closely related to Chandrasekaran's ideas about
generic tasks (1986). The position described in Section 3
sets the tone for our research (particularly for a PhD thesis
on automatic acquisition of control strategies, to be
discussed in the next annual report). The impact of this
perspective is seen in Section 4, where we discuss approaches
to acquiring knowledge about combinations of evidence.
Here, we contrast the global combining functions typically
used in expert systems with the local functions developed
for MU. Section 4 discusses the strengths and weaknesses
of each approach and settles on a hybrid approach.

MU makes it easy to acquire and make explicit the
control features on which control decisions depend.
Similarly, it is easy to specify the functions that keep these
features current, and to query their values during
problem-solving. But as yet, these abilities have not been fully
exploited by the strategic component of our problem solvers.
This is because we have yet to develop an acceptable
representation for strategies and tactics. We have adopted
meta-rules (Davis and Buchanan, 1984) to express
preferences between actions. This leads to an iterative view
of problem-solving, where the problem-solver stops after
every action to check its effects before selecting the next
action. Very recently, we have developed a
representation for short contingency plans, which should allow the
problem-solver to plan a sequence of actions, each
contingent on the outcomes of previous ones. This is discussed
in Section 5.

Although much of our work is concerned with MU, we
also performed an exhaustive series of tests on the GRANT
system (discussed in the previous annual report). The
results of these evaluations are discussed in Section 6.

This  report  is  culled  from  five  papers  and  reports: 

Paul Cohen, Michael Greenberg, and Jefferson DeLisio.
1987. MU: A Development Environment for Prospective
Reasoning Systems. AAAI-87, July, 1987.

Thomas Gruber and Paul Cohen. 1987. Knowledge
Engineering Tools at the Architecture Level. International
Joint Conference on Artificial Intelligence (IJCAI-87),
August, 1987.

Paul Cohen, Glenn Shafer, and Prakash Shenoy. 1987.
Modifiable Combining Functions. EKSL Report 87-05,
Department of Computer and Information Science,
University of Massachusetts, Amherst, MA.

Jefferson DeLisio. 1987. A Notation for Representing
Strategies. EKSL Report 87-08, Department of Computer
and Information Science, University of Massachusetts,
Amherst, MA.

Paul Cohen and Rick Kjeldsen. 1987. Information
Retrieval by Constrained Spreading Activation in
Semantic Networks. Journal of Information Processing
and Management, Forthcoming.

Other  publications  and  reports  for  this  year  are  listed 
below.  A  brief  annotation  for  each  describes  its  contents: 

Paul Cohen, David Day, Jeff DeLisio, Michael Greenberg,
Rick Kjeldsen, Daniel Suthers and Paul
Berman. 1986. Managing of Uncertainty in Medicine.
IEEE Conference on Computers and Communications.
February, 1987. Also, to be published in Int.
Journal of Approximate Reasoning.

This describes the MUM system as it was implemented
in 1986. We have since reimplemented it in
MU.

Paul R. Cohen. 1987. Steps Towards Programs that
Manage Uncertainty. EKSL Report 87-06, Department
of Computer and Information Science, University
of Massachusetts, Amherst, MA.

This includes a detailed analysis of diagnostic strategy,
discusses the nature of problem solving under
uncertainty, and describes the transition from MUM to
MU.

Thomas Gruber and Paul Cohen. 1986. Knowledge
Engineering Tools at the Architecture Level. AAAI
Workshop on High Level Tools, October 7, 8.

This is a longer version of the IJCAI paper, cited
above.

Thomas Gruber and Paul R. Cohen. 1986. Principles
of Design for Acquisition. 1987. 3rd IEEE Conference
on Artificial Intelligence Applications. Orlando,
Florida. February, 1987.

Thomas Gruber and Paul Cohen. 1986. Design for
Acquisition: Principles of Knowledge System Design to
Facilitate Knowledge Acquisition. AAAI
Knowledge Acquisition for Knowledge-Based Systems
Workshop, Banff, Canada, November 3-7, 1986. A
revised version of this paper will appear in The
International Journal of Man-Machine Studies, Spring
1987.

These  papers  describe  three  principles  of  design  for 
task-level  architectures,  and  illustrate  them  with  ex¬ 
amples  from  MU  and  MUM.  The  principles  occassion- 
ally  conflict.  This  is  especially  apparent  in  the  context 
of  acquiring  strategic  knowledge  from  experts,  and  it 
has  led  to  a  PhD  thesis  on  automating  this  process. 

II. MU: A Development
Environment  for  Prospective 
Reasoning  Systems 

MU  is  a  development  environment  for  knowledge  sys¬ 
tems  that  reason  with  incomplete  knowledge.  It  has 
evolved  from  a  program  called  MUM  that  planned  di¬ 
agnostic  sequences  of  questions,  tests,  and  treatments 
for chest and abdominal pain (Cohen, et al., 1986).
This  task  is  called  prospective  diagnosis,  because  it 
emphasizes  the  selection  of  actions  based  on  their  po¬ 
tential  outcomes  and  the  current  state  of  the  patient. 
Prospective  diagnosis  is  uncertain  because  the  precise 
outcomes  of  actions  cannot  be  predicted,  in  part  be¬ 
cause  knowledge  of  the  state  of  the  patient  is  incom¬ 
plete.  Yet  we  have  found  that  physicians  have  rich 
strategic  knowledge  with  which  they  plan  diagnoses  in 
spite of their uncertainty. MU does not provide
a  knowledge  engineer  with  any  particular  strategies, 
but  rather  provides  an  environment  in  which  it  is  easy 
to  acquire,  represent,  and  experiment  with  a  wide  va¬ 
riety  of  strategies  for  prospective  diagnosis  and  other 
prospective  reasoning  tasks. 

Three  goals  underlie  our  research  and  motivate  the 
MU  system.  First,  MU  is  intended  to  provide 
knowledge-engineering  tools  to  help  acquire  expert 
problem-solving strategies. MU allows us to define explicit
control features, which are the terms an expert
uses  to  discuss  strategies.  Control  features  in  medical 
diagnosis  include  degrees  of  belief  in  disease  hypothe¬ 
ses,  monetary  costs  of  evidence,  the  consequences  of 
incorrect  conclusions,  and  “intangibles”  such  as  anxi¬ 
ety  and  discomfort.  Some,  like  degrees  of  belief,  have 
values  that  change  dynamically  during  problem  solv¬ 
ing.  MU  helps  the  knowledge  engineer  define  the  func¬ 
tions  that  compute  these  dynamic  values  and  keeps 
the  values  accessible  during  problem  solving.  For  ex¬ 
ample,  with  MU  we  can  easily  define  a  control  feature 
called  criticality  in  terms  of  two  others,  say  danger¬ 
ousness  and  degree  of  belief,  and  acquire  a  function 
for  dynamically  assessing  the  criticality  of  a  hypothe¬ 
sis  as  its  degree  of  belief  changes. 
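
The criticality example above can be sketched in a few lines of Python. This is only an illustration: the three-point scale and the "weaker of the two" combining rule are assumptions made for the sketch, not MU's actual definitions.

```python
# Hypothetical three-point ordinal scale shared by the features below.
SCALE = ("low", "moderate", "high")


def criticality(dangerousness: str, belief: str) -> str:
    """Assumed combining rule: a hypothesis is only as critical as the
    weaker of its dangerousness and its degree of belief."""
    rank = min(SCALE.index(dangerousness), SCALE.index(belief))
    return SCALE[rank]


class Hypothesis:
    def __init__(self, name: str, dangerousness: str, belief: str):
        self.name = name
        self.dangerousness = dangerousness  # fixed by the expert
        self.belief = belief                # changes as evidence arrives

    @property
    def criticality(self) -> str:
        # Recomputed on every access, so it tracks changes in belief.
        return criticality(self.dangerousness, self.belief)


angina = Hypothesis("angina", dangerousness="high", belief="low")
before = angina.criticality
angina.belief = "high"   # new evidence raises the degree of belief
after = angina.criticality
```

Because the value is derived on demand, a planner that reads `criticality` always sees a value consistent with the current degree of belief.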

Second,  we  want  to  show  that  strategies  enable  a 
prospective  reasoning  system  to  produce  solutions 
that  are  efficient,  in  the  sense  of  minimizing  the  costs 
of attaining given levels of certainty. MU has no “built
in”  problem  solving  strategies,  but  we  have  been  able 
to  acquire  and  implement  efficient,  expert  strategies 
in  MU  because  we  can  define  explicit  control  features 
that  represent  the  various  costs  of  actions,  as  well  as 
the  levels  of  certainty  in  the  evidence  produced  by 
actions. 

Third,  we  want  to  implement  in  MU  a  task-level  ar¬ 
chitecture  for  prospective  reasoning  (Gruber  and  Co¬ 
hen,  1987),  an  environment  for  building  systems  that 
plan efficient sequences of actions, despite uncertainty
about their outcomes. After working in the domains
of  medicine  and  plant  pathology,  we  now  think  that 
many  control  features  pertain  to  diagnostic  tasks  in 
general.  Moreover,  diagnosticians  in  many  fields  seem 
to  use  similar  strategies  to  solve  problems  efficiently. 
This  view  is  influenced  by  the  recent  trend  in  AI  to¬ 
ward  defining  generic  tasks  (Chandrasekaran,  1986) 
such as classification (Clancey, 1985) and the architectures
that support their implementation. MU shares
the orientation toward explicit control with efforts such as
BB* (Hayes-Roth, 1985; Hayes-Roth, et al., 1986) and
Heracles (Clancey, 1986), but emphasizes control features
that are appropriate for prospective reasoning.
In sum, MU is a tool for representing and providing access
to the knowledge that underlies efficient prospective
reasoning. This report begins with an analysis of
prospective  reasoning,  then  describes  the  MU  environ¬ 
ment  first  as  a  program,  emphasizing  its  structure  and 
function,  then  from  the  perspective  of  the  knowledge 
engineer  who  uses  it.  As  an  illustration,  we  describe 
how  MUM  was  reimplemented  in  MU.  We  conclude 
with  a  summary  of  current  work. 

A.  Prospective  Reasoning 

Prospective  reasoning  is  reasoning  about  the  question 
“What shall I do next,” given that

1.  knowledge  about  the  current  state  of  the  world  is 
incomplete, 

2.  the  outcomes  of  actions  are  uncertain, 

3.  there  are  tradeoffs  between  the  costs  of  actions 
with  respect  to  the  problem  solver’s  goals  and 
the  utility  of  the  evidence  they  provide, 

4.  states  of  knowledge  that  result  from  actions  can 
influence  the  utility  of  other  actions. 

An  example  from  medical  diagnosis  illustrates  these 
characteristics: 

A  middle-aged  man  reports  episodes  of 
chest  pain  that  could  be  either  angina  or 
esophageal  spasm;  the  physician  orders  an 
EKG, but it provides no evidence about either
hypothesis; then he prescribes a trial
prescription of vasodilators; the patient has
no  further  episodes  of  pain,  so  the  physician 
keeps  him  on  long-acting  vasodilators  and 
eventually  suggests  a  modified  stress  test  to 
gauge  the  patient’s  exercise  tolerance. 

The  first  and  second  characteristics  of  prospective 
reasoning  are  clearly  seen  in  this  case:  Knowledge 
about  the  state  of  the  patient  is  incomplete  through¬ 
out  diagnosis,  and  the  outcomes  of  actions  (the  EKG, 
trial  therapy,  stress  test)  are  uncertain  until  they  are 
performed  and  are  sometimes  ambiguous  afterwards. 
Less  obvious  is  the  third  characteristic,  the  tradeoffs 
inherent  in  each  action.  Statistically,  an  EKG  is  not 
likely to provide useful evidence, but if it does, the evidence
will be completely diagnostic. The EKG is given
because  its  minimal  costs  (e.g.,  time,  money,  risk,  and 
anxiety)  are  offset  by  the  possibility  of  obtaining  di¬ 
agnostic  evidence1.  Similarly,  trial  therapy  satisfies 
many  goals;  it  protects  the  patient,  costs  little,  has 
few  side-effects  and,  if  successful,  is  good  evidence  for 
the  angina  hypothesis. 

The  fourth  characteristic  of  prospective  reasoning  is 
that  states  of  knowledge  that  result  from  actions  can 
affect the utility of other actions. This is because the
costs  and  benefits  of  actions  are  judged  in  the  context 
of  what  is  already  known  about  the  patient.  For  ex¬ 
ample,  trial  therapy  is  worthwhile  if  the  EKG  does  not 
produce  diagnostic  evidence,  but  is  redundant  other¬ 
wise.  The  outcome  of  an  EKG  thus  affects  the  utility 
of  trial  therapy.  This  implies  a  dependency  between 
the  actions,  and  suggests  a  strategy:  do  the  EKG  first 
because,  if  it  is  positive,  then  trial  therapy  will  be 
unnecessary. 

Dependencies  between  actions  help  the  prospective 
reasoner  to  order  actions.  We  call  this  planning, 
though  it  is  not  planning  in  the  usual  AI  sense  of  the 
word (Sacerdoti, 1979; Cohen and Feigenbaum, 1982).
The differences are due to the first and second characteristics
of prospective reasoning: the state of the
world  and  the  effects  of  actions  are  both  uncertain. 
The  prospective  planner  must  “feel  its  way”  by  esti¬ 
mating  the  likely  outcomes  of  one  or  more  actions,  ex¬ 
ecuting  them,  then  checking  whether  the  actual  state 
of  the  world  is  as  expected.  Plans  in  prospective  rea¬ 
soning  tend  to  be  short.  In  contrast,  uncertainty  is 
excised  from  most  AI  planners  by  assuming  that  the 
initial  state  of  the  world  and  the  effects  of  all  actions 
are  completely  known  (e.g.,  the  STRIPS  assumption, 
Fikes,  Hart,  Nilsson,  1972).  AI  planners  can  proceed 
by  “dead-reckoning,”  because  it  follows  from  these  as¬ 
sumptions  that  every  state  of  the  world  is  completely 
known.  All  further  discussions  of  planning  in  this  re¬ 
port  refer  to  the  “feel  your  way”  variety,  not  to  “dead 
reckoning.” 

Prospective  diagnosis  requires  a  planner  to  select  ac¬ 
tions  based  on  their  costs  and  utility  given  the  cur¬ 
rent  state  of  knowledge  about  the  patient.  We  have 
described  prospective  reasoning  as  planning  because 
the  evidence  from  one  action  may  affect  the  utility 
of another. Alternatively, prospective reasoning can
be  viewed  as  a  series  of  decisions  about  actions,  each 
conditioned  on  the  current  state  of  knowledge  about 
the  patient.  We  considered  decision  analysis  (Raiffa, 
1970; Howard, 1966) as a mechanism for selecting
actions  in  prospective  reasoning,  but  rejected  it  for 
two  reasons.  First,  collapsing  control  features  such  as 
monetary  expense,  time,  and  criticality  into  a  single 
measure of utility negates our goals of explicit control
and providing a task-level architecture for prospective
reasoning (Cohen, 1985; Gruber and Cohen, 1987).
Second, decision analysis requires too many numbers:
a complete, combinatorial model of each decision.
The expected utility of each potential action can only
be calculated from the joint probability distribution
of the possible outcomes of the previous actions. But
although we do not implement prospective reasoning
with decision analysis, MU is designed to provide qualitative
versions of several decision-analytic concepts,
including the utility of evidence and sensitivity analysis.

1 This example oversimplifies the reasons for giving an EKG, but
not the cost/benefit analysis that underlies the decision.
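
To make the "too many numbers" objection to decision analysis concrete, the sketch below enumerates the joint outcomes of a short sequence of actions. The action names and outcome sets are invented for illustration; the point is only that a full joint distribution needs one probability per combination, and the number of combinations is the product of the outcome counts.

```python
from itertools import product


def joint_outcomes(outcomes_per_action):
    """Enumerate every combination of outcomes for a sequence of actions."""
    return list(product(*outcomes_per_action))


# Hypothetical outcome sets for three diagnostic actions.
actions = {
    "ekg": ["ischemic-changes", "normal"],
    "trial-therapy": ["pain-relieved", "no-change"],
    "stress-test": ["positive", "negative", "inconclusive"],
}

table = joint_outcomes(actions.values())
size = len(table)  # 2 * 2 * 3 combinations, each needing its own probability
```

Adding a fourth action with three outcomes would triple the table again, which is the growth the text describes as a complete, combinatorial model.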


B.  The  MU  Environment  -  An 
Overview 

A  coarse  view  of  MU’s  structure  reveals  these  compo¬ 
nents: 

•  a  frame-based  representation  language, 

•  tools  for  building  inference  networks, 

•  an  interface  for  defining  control  features  and  the 
functions  that  maintain  their  values, 

• a language for asking questions about the state
of a problem and how to change its state,

• a user interface for acquiring data during
problem-solving.

With these tools, a knowledge engineer can build a
knowledge  system  with  a  planner  for  prospective  rea¬ 
soning.  MU  does  not  “come  with”  any  particular  plan¬ 
ners,  but  it  provides  tools  for  building  planners  and  in¬ 
corporating  expert  problem-solving  strategies  within 
them. 

Among  MU’s  tools  is  an  editor  for  encoding  domain 
inferences,  such  as  if  EKG  shows  ischemic  changes 
then  angina  is  confirmed,  in  an  inference  network. 
MU does not dictate what the nodes in the inference
network should represent, except in the weak
sense  that  nodes  “lower”  in  the  network  relative 
to  the  direction  of  inference  provide  evidence  for 
those  “higher”  up.  However,  the  nodes  in  the  net¬ 
work  are  usually  differentiated;  for  example,  in  Fig¬ 
ure  1  some  nodes  represent  raw  data,  others  repre¬ 
sent  combinations  of  data  (called  clusters),  and  a  third 
class  represents  hypotheses.  In  the  medical  domain, 
data  nodes  represent  individual  questions,  tests,  or 
treatments. Clusters combine several data; for example,
the risk-factors-for-angina cluster combines the
patient’s  blood  pressure,  family  history,  past  medical 
history,  gender,  and  so  on.  Hypothesis  nodes  repre¬ 
sent  diseases  such  as  angina. 

Since  MU  does  not  provide  a  planner,  the  knowledge 
engineer  is  required  to  build  one.  The  planner  should 
answer  two  questions: 

Figure 1: Organization of Knowledge Within MU (panels: An Inference Net in MU; MU System)



•  Which  node(s)  in  the  network  should  be  in  the 
focus set, and which of these should be the immediate
focus of attention?

•  Which  actions  are  applicable,  given  the  focus  set, 
and  which  of  these  should  be  taken? 

For  example,  in  the  medical  domain  the  focus  set 
might  include  all  disease  hypotheses  that  have  some 
support,  and  the  immediate  focus  of  attention  might 
be  the  most  dangerous  one.  The  potential  actions 
might  be  the  leaf  nodes  of  the  tree  rooted  at  the  focus 
of  attention  (Fig.  1),  and  the  selected  action  might  be 
the  cheapest  of  the  potential  actions. 
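
The example strategy in the preceding paragraph (focus on the most dangerous supported hypothesis, then take the cheapest leaf action beneath it) can be sketched as a single planning step. The `Node` fields, the numeric danger and cost values, and the small network are assumptions made for this sketch; MU itself supplies no planner.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    children: list = field(default_factory=list)  # nodes that provide evidence
    supported: bool = False   # meaningful for hypotheses
    danger: int = 0           # meaningful for hypotheses
    cost: int = 0             # meaningful for leaf actions


def leaves(node):
    """Return the leaf nodes of the tree rooted at `node`."""
    if not node.children:
        return [node]
    found = []
    for child in node.children:
        found.extend(leaves(child))
    return found


def plan_step(hypotheses):
    """One step of the example strategy: pick a focus, then an action."""
    focus_set = [h for h in hypotheses if h.supported]
    if not focus_set:
        return None, None
    focus = max(focus_set, key=lambda h: h.danger)     # most dangerous
    action = min(leaves(focus), key=lambda a: a.cost)  # cheapest action
    return focus, action


# Invented network fragment with made-up costs and danger ratings.
ekg = Node("ekg", cost=50)
history = Node("family-history", cost=1)
risk = Node("risk-factors-for-angina", children=[history])
angina = Node("angina", children=[ekg, risk], supported=True, danger=3)
endoscopy = Node("endoscopy", cost=400)
ulcer = Node("ulcer", children=[endoscopy], supported=True, danger=1)

focus, action = plan_step([angina, ulcer])
```

A different expert strategy would change only the two selection criteria in `plan_step`, which is the kind of substitution the text says the knowledge engineer is responsible for.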

MU  provides  an  interface  to  help  the  knowledge  engi¬ 
neer  define  control  features  such  as  the  degree  of  belief 
in  hypotheses,  the  dangerousness  of  diseases,  and  the 
costs  of  diagnostic  actions.  It  also  provides  a  language 
with  which  a  planner  can  query  the  values  of  features 
and  ask  about  actions  that  would  change  those  values. 
Planners  can  ask,  for  example,  “What  is  the  current 
level  of  belief  in  angina?”  or  “Tell  me  all  the  inexpen¬ 
sive  ways  to  increase  the  level  of  belief  in  angina,”  or 
even  the  hypothetical  question,  “Would  the  level  of 
belief  in  angina  change  if  blood  pressure  was  high?” 

The  relationship  between  these  functions  of  MU  and 
the functions of a planner is shown in Figure 2. Using
MU, a knowledge engineer can: define a control
feature  such  as  criticality  in  terms  of  other  features 
such  as  dangerousness  and  degree  of  belief;  specify 
a  combining  function  for  calculating  dynamically  the 
value  of  criticality  from  these  other  features  during 
problem  solving;  associate  criticality  and  its  combin¬ 
ing  function  with  a  class  of  nodes,  such  as  diseases, 
and have each member of the class inherit the definitions;
and write a planner that encodes an expert
strategy for dealing with critical or potentially-critical
diseases. MU facilitates the development of planners,
and makes their behavior explicit and efficient, but
the design of planners, and the acquisition of strategies
and the control features on which they depend, is
the job of the knowledge engineer.

Figure 2: MU System Schematic

C.  The  MU  Environment  -  Features 
and  Combining  Functions 

Knowledge  representation  in  MU  centers  around  fea¬ 
tures.  Features  and  their  values  are  the  information 
with  which  planning  decisions  are  made.  Each  node 
in  a  MU  inference  network  can  have  several  features; 
for  example,  the  node  that  represents  trial  therapy  for 
angina  includes  features  for  monetary  cost  and  risk  to 
the  patient.  Features  are  defined  in  the  normal  course 
of  knowledge  engineering  to  support  expert  strategies 
for  prospective  reasoning.  We  have  identified  four 
classes  of  features,  differentiated  by  their  value  types, 
how  they  are  calculated,  and  the  operations  that  MU 
can  perform  on  them: 

Static  The  value  of  a  static  feature  is  specified  by  the 
expert  and  does  not  change  at  run  time.  Mone¬ 
tary  cost  is  a  typical  static  feature,  as  the  cost  of 
an action does not change during a session.

Datum  The  value  of  a  datum  feature  is  acquired  at 
run  time  by  asking  the  user  questions.  Data  are 
often  the  results  of  actions;  for  example  EKG 
shows  ischemic  changes  is  a  potential  result  of 
performing  an  EKG. 

Dynamic The value of a dynamic feature is computed
from the values of other features in
the network. The value of each dynamic feature
is calculated by a combining function, acquired
through knowledge engineering. A dynamic feature
of every hypothesis is its degree of belief,
a function of the degrees of belief of its evidence.

Focus The value of a focus feature is a set of nodes
whose  features  satisfy  a  user-defined  predicate. 
Focus  features  are  a  subclass  of  dynamic  features. 
In  medicine,  the  differential  focus  feature  can  be 
defined  as  the  list  of  all  triggered  hypotheses  that 
are  not  confirmed  or  disconfirmed. 

Feature values can belong to several data types, including
integers, sets, normal (one of an unordered
set  of  possible  values),  ordinal  (one  of  an  ordered  set 
of  possible  values),  boolean,  and  relational  (e.g.,  isa). 

Four  operations  are  defined  for  features:  one  can  set 
a  feature  value  (e.g.,  assert  that  the  monetary  cost  of 
a test is high), get a feature value (e.g., ask for the
cost  of  a  test),  ask  how  to  change  a  feature  value,  and 
ask  what  are  the  effects  of  changing  a  feature  value. 
Planners  need  answers  to  these  kinds  of  questions  to 
help  them  select  actions  (see  Section  5  for  further  ex¬ 
amples.) 

Not all combinations of feature type, value type, and
operations are possible. Figure 3 summarizes the
legal combinations.

MU  provides  an  interface  for  defining  features.  A 
full  definition  includes  the  feature  type,  value  type, 
its  range  of  values,  and  the  domain  of  its  combin¬ 
ing  functions.  For  instance,  the  dynamic  feature 
level  of  support  is  defined  to  have  seven  values  on 
an  ordinal  scale:  disconfirmed,  strongly-detracted,  de¬ 
tracted,  unknown,  supported,  strongly-supported  and 
confirmed.  Figure  4  shows  the  definition  of  level  of 
support. 

Instances  of  this  feature  (and  others)  are  associated 
with  individual  hypotheses,  each  of  which  may  have 
its  own,  local  function  for  calculating  level  of  support, 
and  its  own,  dynamic  value  for  the  feature2.  For  ex¬ 
ample,  Figure  5  shows  part  of  the  frame  for  the  angina 
hypothesis,  encompassing  an  instance  of  the  level  of 
support  feature,  and  showing  a  fragment  of  the  func¬ 
tion  for  calculating  its  value  for  angina. 


Level-Of-Support

Feature-type: Dynamic
Value-Type: Ordinal
Value-restriction:
(disconfirmed strongly-detracted detracted
unknown supported strongly-supported
confirmed)
Value: the current level of support of the
hypothesis
Combination-function-slot: local to
each hypothesis

Figure 4: Definition of Level-Of-Support
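
An ordinal value restriction like the one in Figure 4 maps naturally onto an ordered enumeration. The sketch below assumes Python's `IntEnum` as the encoding; the seven value names come from the figure, while the class name and the numeric ranks are choices made for illustration.

```python
from enum import IntEnum


class LevelOfSupport(IntEnum):
    """The seven ordinal values of Figure 4, weakest to strongest."""
    DISCONFIRMED = 0
    STRONGLY_DETRACTED = 1
    DETRACTED = 2
    UNKNOWN = 3
    SUPPORTED = 4
    STRONGLY_SUPPORTED = 5
    CONFIRMED = 6


# Ordinal values can be compared and sorted, unlike values drawn from
# an unordered set.
levels = [LevelOfSupport.SUPPORTED,
          LevelOfSupport.DISCONFIRMED,
          LevelOfSupport.CONFIRMED]
ranked = sorted(levels, reverse=True)
```

The ordering is what lets a planner ask for hypotheses with "moderate support or better" rather than testing for each value by name.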


Combining  functions  calculate  values  for  dynamic  fea¬ 
tures  such  as  level  of  belief,  criticality,  elapsed  time, 
and  so  on.  They  serve  two  important  functions:  First, 
they  keep  the  state  of  MU’s  inference  network  up-to- 
date;  for  example,  when  the  result  of  an  EKG  becomes 
available,  the  combining  function  for  the  angina  node 
updates  the  value  of  its  level  of  support  feature  ac¬ 
cordingly. 

Angina

Feature-list: (level-of-support severity)
Current-level-of-support: strongly-supported
Combination-function:
IF value of ekg is ischemic-changes
THEN angina is confirmed
ELSEIF episode-incited-by contains exertion
AND risk-factors-for-angina are supported
THEN angina is strongly-supported . . .

Figure 5: Part of the Angina Frame With Local Combining
Function
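
The two rule clauses visible in Figure 5 can be rendered as an ordinary function over the values of the nodes connected to angina. This is a sketch, not MU's rule syntax: the dictionary representation of node values is an assumption, and because the figure elides the remaining clauses, the final default is also an assumption.

```python
def angina_level_of_support(data: dict) -> str:
    """Python rendering of the two clauses shown in Figure 5.

    `data` maps node names to their current values (an assumed
    representation chosen for this sketch).
    """
    if data.get("ekg") == "ischemic-changes":
        return "confirmed"
    if ("exertion" in data.get("episode-incited-by", ())
            and data.get("risk-factors-for-angina") == "supported"):
        return "strongly-supported"
    # The figure elides the remaining clauses; falling back to
    # "unknown" here is an assumption, not part of the source.
    return "unknown"


current = angina_level_of_support({
    "episode-incited-by": ["exertion", "cold"],
    "risk-factors-for-angina": "supported",
})
```

Here `current` matches the Current-level-of-support value shown in the frame, since only the second clause fires.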

Figure 3: Capabilities By Feature Type
[Table relating the feature types (static, datum, dynamic, focus) to the data types (Number, Set, Ordinal, Normal) and the questions (Get, Set, How To, Effect Of).]


2 Not all feature values are calculated locally, but, for reasons discussed
in Cohen, Shafer, and Shenoy (1987) and Cohen, et al. (1986),
levels of belief are

Second, and perhaps more important from the standpoint
of a planner, combining functions provide a
prospective  view  of  the  effects  of  actions;  for  example, 
the  combining  function  for  angina  can  be  interpreted 
prospectively  to  say  that  EKG  can  potentially  con¬ 
firm  angina.  The  same  point  holds  for  the  combining 
functions  for  other  features:  MU  can  prospectively 
assess  the  potential  effects  of  actions  on  all  dynamic 
features.  A  planner  can  ask  MU,  “If  EKG  is  negative, 
what  changes?”  and  get  back  a  list  of  all  the  features 
of  all  data,  clusters,  and  hypotheses  that  are  in  some 
way affected by the value of EKG. The effects of actions
are assessed in the context of MU’s current state
of  knowledge  (i.e.,  the  state  of  its  network).  For  ex¬ 
ample,  if  an  EKG  has  been  given  and  its  results  were 
negative,  then  MU  knows  that  the  answer  to  the  pre¬ 
vious  question  is  that  nothing  changes. 
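
One way to picture the prospective use of combining functions is to evaluate each function twice, once on the current state and once on a hypothetical state, and report the differences. The sketch below assumes that approach; the two combining functions, the state dictionary, and the name `what_changes` are all invented for illustration and are not MU's interface.

```python
def angina_support(data):
    # Stand-in combining function, invented for this sketch.
    if data.get("ekg") == "ischemic-changes":
        return "confirmed"
    if data.get("ekg") == "negative":
        return "detracted"
    return "unknown"


def ulcer_support(data):
    # Stand-in combining function that ignores the EKG entirely.
    return "supported" if data.get("pain-after-meals") else "unknown"


def what_changes(state, combiners, datum, value):
    """Answer "if `datum` were `value`, what changes?" against `state`."""
    hypothetical = {**state, datum: value}
    changes = {}
    for feature, fn in combiners.items():
        before, after = fn(state), fn(hypothetical)
        if before != after:
            changes[feature] = (before, after)
    return changes


combiners = {"angina": angina_support, "ulcer": ulcer_support}
state = {"pain-after-meals": True}

first = what_changes(state, combiners, "ekg", "negative")
state["ekg"] = "negative"          # the EKG has now been given
second = what_changes(state, combiners, "ekg", "negative")
```

The second query returns an empty result, mirroring the observation that once a negative EKG is already known, asking the same question again changes nothing.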

The  syntax  of  combining  functions  is  relatively  unim¬ 
portant  provided  they  are  declarative,  so  MU’s 
question-answering  interface  can  read  them,  and  ex¬ 
perts  can  easily  specify  and  modify  them.  Currently, 
combining  functions  look  like  rules,  but  we  are  ex¬ 
perimenting  with  tabular  and  graphic  forms  (Cohen, 
Shafer,  and  Shenoy,  1987). 

The two major classes of combining functions are local
and  global.  A  local  function  for  a  node  such  as  angina 
refers  only  to  the  nodes  in  the  inference  network  that 
are  directly  connected  to  angina.  In  contrast,  global 
functions  survey  the  state  of  MU’s  entire  inference 
network.  Functions  for  focus  features  take  a  global 
perspective  because  the  value  of  a  focus  feature  is  the 
subset  of  nodes  in  the  network  whose  features  satisfy 
some  predicate.  For  example,  Figure  6  illustrates  the 
combining  function  for  the  differential  focus  feature. 
Any  node  that  represents  a  disease  hypothesis,  and  is 
triggered,  but  is  neither  confirmed  nor  disconfirmed  is 
a  member  of  the  differential. 


Differential

feature-list: (focus-feature)
current-focus: (angina prinzmetal ulcer)
combining-function:
Set-of $node$ member-of disease Such-that
$node$ is triggered AND
level-of-support of $node$ is not confirmed AND
level-of-support of $node$ is not disconfirmed

Figure 6: Part of the Global Focus-Feature Differential
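
The global combining function of Figure 6 reads as a set comprehension over all disease nodes. The sketch below assumes a flat list of hypothesis records; the record fields and the sample nodes are illustrative rather than MU's frame representation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hypothesis:
    name: str
    triggered: bool
    level_of_support: str


def differential(diseases):
    """All triggered disease hypotheses that are neither confirmed
    nor disconfirmed, following the predicate in Figure 6."""
    return {
        d.name
        for d in diseases
        if d.triggered
        and d.level_of_support not in ("confirmed", "disconfirmed")
    }


# Invented network state for illustration.
nodes = [
    Hypothesis("angina", True, "supported"),
    Hypothesis("ulcer", True, "confirmed"),
    Hypothesis("esophageal-spasm", False, "unknown"),
    Hypothesis("colon-cancer", True, "unknown"),
]
current_focus = differential(nodes)
```

Unlike a local function, this one has to scan every node, which is why the text describes focus features as taking a global perspective.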


The  knowledge  engineer  can  define  many  focus  fea¬ 
tures,  each  corresponding  to  a  class  of  nodes  that  a 
planner  may  want  to  monitor.  Besides  the  differential, 
a  planner  might  maintain  the  set  of  critical  hypothe¬ 
ses  (e.g.,  all  dangerous  hypotheses  that  have  moder¬ 
ate  support  or  better),  or  the  set  of  hypotheses  that 
have  relatively  high  prior  probability,  or  the  set  of  all 
supported  clusters  that  potentially  confirm  a  particu¬ 
lar  hypothesis.  MU  supports  set  intersection,  union, 
and  sorting  on  the  sets  of  nodes  maintained  by  focus 
features.  A  planner’s  current  focus  of  attention  is  rep¬ 
resented  in  terms  of  the  results  of  these  operations. 
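
The set operations mentioned above compose directly. The following sketch assumes focus features hold sets of node names and that sorting uses the ordinal level-of-support scale; the particular sets and support values are invented and carry no medical meaning.

```python
# Ordinal scale, weakest to strongest (the seven values named in the text).
SCALE = ["disconfirmed", "strongly-detracted", "detracted", "unknown",
         "supported", "strongly-supported", "confirmed"]

# Invented example state: node sets held by two focus features and the
# current level of support of each hypothesis.
differential = {"angina", "ulcer", "esophageal-spasm"}
dangerous = {"angina", "colon-cancer", "esophageal-spasm"}
support = {
    "angina": "strongly-supported",
    "ulcer": "supported",
    "esophageal-spasm": "detracted",
    "colon-cancer": "unknown",
}


def by_support(nodes):
    """Sort node names from best supported to least supported."""
    return sorted(nodes, key=lambda n: SCALE.index(support[n]), reverse=True)


watch_list = differential | dangerous             # union
candidates = by_support(differential & dangerous)  # intersection, then sort
focus_of_attention = candidates[0]
```

The planner's focus of attention is then simply the head of the sorted intersection.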

D.  MU  from  the  Knowledge  Engi¬ 
neer’s  Perspective 

MU  is  a  development  environment  for  prospective  rea¬ 
soning  systems.  We  began  our  research  on  prospective 
reasoning when we were building a system, MUM, for
prospective diagnosis (Cohen, et al., 1986), and realized
that we lacked the knowledge engineering tools
to  acquire  and  modify  diagnostic  strategies.  An  ex¬ 
ample  will  illustrate  the  knowledge  engineering  issues 
in  building  MU: 

MUM had several strategic phases, each of which specified
how to assess a focus of attention and select an
action.  One  phase,  called  initial  assessment,  directed 
MUM  to  focus  on  triggered  hypotheses  one  by  one 
and  take  inexpensive  actions  that  potentially  support 
each.  This  covered  a  wide  range  of  situations,  and 
maintained  the  efficiency  of  diagnoses  by  focusing  on 
low-cost  evidence,  but  it  made  little  sense  for  very 
dangerous disease hypotheses. For these, diagnosticity,
not cost, is the most important criterion
for selecting actions. Once the expert explained this,
we should have been able to add a new strategic phase
immediately, run the system, and iterate if its performance
was  incorrect.  Unfortunately,  control  features  such  as 
criticality  and  diagnosticity  did  not  have  declarative 
representations in MUM, were implemented in Lisp,
and could not easily be composed from other control
features. Operations such as sorting a list of critical
hypotheses by their level of support were also implemented
in Lisp. Each strategic phase required a day or
two to write and debug. From the standpoint of the
expert, this was an unacceptable delay.

The  MUM  project  showed  us  that  MU  should  facili¬ 
tate  acquisition  of  control  features,  maintain  their  val¬ 
ues  efficiently,  and  support  a  broad  range  of  questions 
about the state of the inference network. MU allows
a planner to ask six classes of questions:

Questions  about  state  are  concerned  with  the  current 
values  of  features.  For  example: 

Q1: “What is the current level of support for
angina?” 

Q2:  “Is  an  ulcer  dangerous?” 

Q3:  “What  is  the  cost  of  performing  an  angiogram?” 

Another class of questions is asked to find out how
to  achieve  a  goal.  Examples  of  questions  about  goals 
are: 

Q4:  “Given  what  I  know  now,  which  tests  might  con¬ 
firm  angina?” 

Q5:  “What  are  all  of  the  tests  that  might  have  some 
bearing  on  heart  disease?” 

These  questions  help  a  planner  identify  relevant  ac¬ 
tions  and  select  among  them.  Those  that  pertain  to 
levels of belief are answered by referring to the appropriate
combining functions and current levels of belief.
For example, the answer to the question about angina
is  “EKG,”  if  an  EKG  has  not  already  been  performed 
(Fig.  5). 

Questions  about  the  effects  of  actions  allow  a  plan¬ 
ner  to  understand  the  ramifications  of  an  action.  For 
example, 

Q6: “Which disease hypotheses are affected by performing
an EKG?”

Q7: “What are the possible results of an angiogram?”

Q8: “Does age have an effect on the criticality of
colon cancer?”

MU  answers  these  questions  by  traversing  the  rela¬ 
tions  between  actions  and  nodes  “higher”  in  the  infer¬ 
ence  network.  For  example,  Q6  is  answered  by  finding 
all  the  nodes  for  which  EKG  provides  evidence.  The 
planner  may  ask  either  for  the  immediate  consequence 
of  knowing  EKG,  or  for  the  consequences  to  any  de¬ 
sired  depth  of  inference. 
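
Answering a question like Q6 amounts to walking the evidence relations upward from an action, optionally stopping at a given depth. The sketch below assumes the network is stored as a mapping from each node to the nodes it provides evidence for; the mapping, the `heart-disease` node, and the function name are hypothetical.

```python
from collections import deque

# Assumed representation: node -> nodes "higher" in the network for
# which it provides evidence. The contents are invented for illustration.
SUPPORTS = {
    "ekg": ["angina"],
    "family-history": ["risk-factors-for-angina"],
    "risk-factors-for-angina": ["angina"],
    "angina": ["heart-disease"],
}


def affected_by(action, supports, max_depth=None):
    """Nodes reachable upward from `action`, to `max_depth` inference
    steps (or to any depth when `max_depth` is None)."""
    seen = set()
    queue = deque([(action, 0)])
    while queue:
        node, depth = queue.popleft()
        if max_depth is not None and depth >= max_depth:
            continue
        for parent in supports.get(node, []):
            if parent not in seen:
                seen.add(parent)
                queue.append((parent, depth + 1))
    return seen


immediate = affected_by("ekg", SUPPORTS, max_depth=1)
everything = affected_by("ekg", SUPPORTS)
```

With `max_depth=1` the planner sees only the immediate consequence of knowing the EKG result; without a limit it sees every node the result could eventually bear on.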

Focus  questions  help  a  planner  establish  focus  of  at¬ 
tention.  For  example: 

Q9:  “Give  me  all  diseases  that  are  triggered  and  dan¬ 
gerous.” 

Q10:  “What  are  all  of  the  critical  diseases  for  which 
I  have  no  information?” 

Q11: “Are any hypotheses confirmed?”

Questions  about  multiple  effects  allow  the  planner 
to  combine  the  previous  question  types  into  more  com¬ 
plex  queries  such  as  “What  tests  can  discriminate  be¬ 
tween  angina  and  esophageal  spasm?”  In  this  case, 
the  term  discriminate  is  defined  to  mean  “simultane¬ 
ously  increase  the  level  of  belief  in  one  disease  and 
lower it in another.”
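
The definition of discriminate given above can be checked mechanically if each test's possible outcomes are annotated with the direction in which they move belief. The sketch below assumes such a table, with +1 meaning belief rises and -1 meaning it falls. The test names and effects are placeholders with no medical meaning, and MU derives this information from combining functions rather than from a table.

```python
# effects[test][outcome][disease] = +1 (raises belief) or -1 (lowers it).
# Entirely invented placeholder data.
EFFECTS = {
    "test-a": {"positive": {"angina": +1, "esophageal-spasm": -1}},
    "test-b": {"positive": {"angina": +1}},
    "test-c": {
        "positive": {"angina": +1, "esophageal-spasm": +1},
        "negative": {"angina": -1, "esophageal-spasm": +1},
    },
}


def discriminating_tests(effects, a, b):
    """Tests with at least one outcome that moves belief in `a` and `b`
    in opposite directions."""
    result = []
    for test, outcomes in effects.items():
        for shifts in outcomes.values():
            # Opposite signs give a negative product; a missing entry
            # counts as no effect (0).
            if shifts.get(a, 0) * shifts.get(b, 0) < 0:
                result.append(test)
                break
    return result


found = discriminating_tests(EFFECTS, "angina", "esophageal-spasm")
```

A test such as `test-b`, which bears on only one of the two diseases, is excluded because it cannot shift belief between them.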

Hypothetical  questions  allow  the  planner  to  identify 
dependencies among actions. For example, one can
ask,  “Suppose  the  response  to  trial  therapy  is  positive. 
Now,  could  a  stress  test  still  have  any  bearing  on  my 
belief  in  angina?” 

With  the  ability  to  define  control  features  and  answer 
such  questions,  we  quickly  reimplemented  MUM’s 
strategic  phase  planner.  Most  of  the  effort  went  into 
adding  declarative  definitions  of  control  features  and 
their  combining  functions  to  MUM’s  medical  inference 
network.


E.  Conclusion 

MU  supports  the  construction  of  systems  that  have 
the  characteristics  of  prospective  reasoning  identified 
in Section 2: Prospective reasoning involves answering
the question, “What shall I do next,” given uncertainty
about the state of the world, the effects of
actions,  tradeoffs  between  the  costs  and  benefits  of 
actions,  and  precondition  relations  between  actions. 
The  six  classes  of  questions,  discussed  above,  help 
planners  to  decide  on  courses  of  action  despite  un¬ 
certainty.  Questions  about  state  make  uncertainty 
about  hypotheses  explicit.  Hypothetical  questions 
and  questions  about  effects  make  uncertainty  about 
the  outcomes  of  actions  explicit.  Questions  about 
goals  and  multiple  effects  help  a  planner  identify 
the  tradeoffs  between  actions.  And  hypothetical 
questions  make  dependencies  between  actions  explicit. 

III.  Knowledge  Engineering 
Tools  at  the  Architecture  Level 

This  report  is  about  tools  for  knowledge  engineering 
at the architecture level. A knowledge system architecture
specializes common AI problem-solving techniques
to a particular class of tasks. Architectures
provide  descriptions  of  particular  kinds  of  problem 
solving  (e.g.,  diagnosis  or  configuration)  at  a  concep¬ 
tual  level  that  is  above  the  implementation,  thus  mak¬ 
ing  clear  which  aspects  of  a  class  of  problems  are  in¬ 
trinsic  to  the  problem  and  which  are  artifacts  of  the 
implementation.  Architectures  are  partial  designs  in 
which  some  decisions  are  made  in  advance  to  support 
particular  task  characteristics.  For  example,  many 
medical  diagnosis  systems  first  interpret  data  bottom- 
up  to  find  “triggered”  disease  hypotheses,  then  set 
top-down  goals  of  acquiring  evidence  pro  and  con  the 
triggered  hypotheses.  This  “trigger/acquire  evidence” 
cycle  is  an  intrinsic  part  of  any  architecture  for  the 
class  of  medical  diagnosis  tasks,  though  it  might  be 
implemented in a wide variety of ways.

Architecture-level  tools  for  knowledge  engineers  can 
improve  the  productivity  of  system  development  and 
knowledge  acquisition  because: 

•  By  supporting  the  abstraction  of  representational 
and  computational  primitives  at  the  architecture 
level,  they  permit  the  knowledge  engineer  and 
expert  to  cooperatively  develop  systems  using  a 
shared  language  of  architecture  constructs,  rather 
than in terms of the underlying implementation.

• They can incorporate knowledge about the architecture
to facilitate system development and
knowledge acquisition (e.g., by enforcing constraints
on the types and values of elements in
the knowledge base).

3 Data abstraction and related methodologies such as object-oriented
programming are well established software engineering techniques
for reducing the complexity of large programs by hiding implementation
details (Abelson and Sussman, 1985). For knowledge

The idea of an architecture level underlies recent work on knowledge
systems.4 Chandrasekaran and his colleagues have identified a number
of “generic tasks” such as hierarchical diagnosis and routine design,
and have developed task-specific representation languages and control
strategies for them (Chandrasekaran, 1986; Bylander and Mittal, 1986;
Brown and Chandrasekaran, 1984). McDermott and colleagues have
produced several knowledge systems using architectures that integrate
knowledge acquisition tools with the problem-solving methods
(McDermott, 1983; Kahn, et al., 1984; Eshelman and McDermott, 1986;
Marcus, 1987; Kuan, et al., 1987). Clancey has described in detail
the heuristic classification method embodied in the HERACLES
architecture (Clancey, 1985, 1986). Newell (1982) anticipated much of
this work in his AAAI President’s Address on the Knowledge Level,
where he distinguished the knowledge of an intelligent agent, which
is used to model its behavior, from the knowledge representation that
describes how the knowledge is encoded in a symbol system.

This report presents an analysis of the role of knowledge engineering
tools at the architecture level. We describe three complementary
views of what is meant by the architecture level, and illustrate them
in the context of MU, an architecture for systems that manage
uncertainty by deciding how to act. We show how the
architecture-level analysis leads to a hierarchical organization of
knowledge engineering tools to support software development and
knowledge acquisition for MU systems. We conclude with some
advantages of this approach to knowledge engineering.

A.  Three  views  of  the  architecture 
level 

Architectures can be viewed from three perspectives, each of which
suggests roles for architecture-level tools. First, the functional
view presents an architecture as an application of general AI
techniques to suit a particular style of problem solving. One might
say that, functionally, the blackboard architecture is well-suited to
problems with noisy data and multiple sources of evidence (Erman, et
al., 1980; Nii, 1986). There are architectures for simple
classification (e.g., traversing decision trees), heuristic
classification (e.g., HERACLES, Clancey, 1986; CSRL, Bylander and
Mittal,

systems, the architecture is a particularly useful level of
abstraction, and tools to support it reduce the inherent complexity
of large knowledge-based programs by separating the representational
and computational needs of the problem solving task from
implementation decisions.


1986), constructing configurations (e.g., SALT, Marcus, 1987; COAST,
Bennett, 1986), and design (e.g., CSRL, Brown and Chandrasekaran,
1984; DOMINIC, Howe et al., 1986).

The second perspective is structural: an architecture is a partial
design that includes specifications of knowledge representation
formalisms, inference mechanisms, and control strategies. Many of
these structural components are available from commercially available
AI programming environments. Architectures, however, are not
arbitrary combinations of these components, but “good” combinations
designed by the knowledge engineer for particular tasks.

A third view of an architecture is that it defines a virtual machine.
The architecture provides a language that describes the behavior of a
system in terms natural for the knowledge engineer and expert. For
example, most medical diagnosis systems provide some kind of support
for triggering, that is, making particular hypotheses “active” when
certain events (typically input data) occur. To the expert,
triggering might correspond to “bringing a diagnosis to mind.” A
programmer can produce the effect of triggering using
implementation-level primitives (e.g., giving triggered diseases high
certainty factors or agenda priorities). But terms such as “trigger,”
not their implementation, are the medium of knowledge engineering.
Such task-level terms promote explanation (Swartout, 1983) and
knowledge acquisition6 (Gruber and Cohen, 1986). Knowledge engineers,
experts, and users can all understand triggering without thinking
about how it is implemented. A virtual machine that executes
“triggering” is easier to program.

The interactions of these views of the MU architecture are apparent
in the design of knowledge engineering tools. Figure 7 shows a
hierarchy of tools that supports development of systems in MU. The
foundation is a commercially available AI programming environment
that includes implementation primitives such as rules and frames, and
basic AI programming techniques such as pattern-matching rule
interpreters and message-passing. The first layer in Figure 7 is a
structural description of the implementation of MU. It is not a
design for an architecture, because no functional description has
been given or is implied by this collection of implementation
primitives and AI programming techniques, which could be instantiated
to provide a wide range of behaviors.


6 Without task-level terms, the (non-programmer) expert is
effectively barred from working directly with the knowledge base.

Figure 7: A hierarchy of knowledge engineering tools to support the
MU architecture.

B.  Tools  for  the  MU  Architecture 

In this section we present an architecture for systems that reason
under uncertainty, called MU, with the aim of illustrating how the
three views of architectures influence the design of knowledge
engineering tools. MU grew out of experience with MUM (Managing
Uncertainty in Medicine), a system for planning a series of
diagnostic questions, tests, and treatments for diseases manifesting
chest and abdominal pain (Cohen, et al., 1986). The primary aim of
MUM is to decide how to act when data are insufficient for diagnosis
and treatment. Like a physician, MUM reasons about tradeoffs between
the costs of evidence, the marginal utility of potential data given
what is already known, the effects of treatments and the evidence
they provide, and so on. MU is an architecture for building systems
like MUM that reason about uncertain situations in deciding how to
act.

Viewed from a functional perspective, MU’s task is managing
uncertainty by taking appropriate actions. Structurally, MU has a
large long-term memory of hypotheses and their supporting evidence
and intermediate conclusions, a working memory of developing
hypotheses, inference mechanisms for propagating the effects of
evidence in working memory, and control strategies. Viewed as a
virtual machine, MU supports knowledge engineering in terms that make
sense for diagnostic tasks, such as hypothesis and
potential-evidence. These terms are instantiated for specific domains
by terms such as disease, or further instantiated as specific
diseases such as angina.

The  functional  view  of  an  architecture  constrains  how 
implementation-level  primitives  and  techniques  are 
specialized  for  a  particular  kind  of  problem-solving. 

The functional requirements of MU are that it should represent
inferential relations between data, intermediate conclusions, and
hypotheses. It should maintain measures of belief in all these
objects, decide focus of attention (i.e., which object to seek
evidence for), and decide which evidence to seek. At the second level
of Figure 7, the frames and slots of the first level are specialized
as hypotheses and inferential relations, respectively. Rules are used
to implement combining functions for evidence pro and con hypotheses.
Some properties of hypotheses (a subset of their slot values) are
used as control parameters, which help determine focus of attention.
Similarly, value propagation functions are implemented via the demons
and message passing, and so on. Thus, the functional view of the MU
architecture tells the architecture designer how to specialize
low-level implementation primitives and techniques to achieve a
virtual machine, or shell, for a particular class of tasks.

[Figure 7 (a hierarchy of knowledge engineering tools to support the
MU architecture), partially legible entries: "...tance mechanisms,
assumption maintena...", "demon invocation and message passing",
"...dow system, network grapher"]

An architecture is designed not for a specific task like diagnosing
chest pain, but for a class of tasks such as diagnostic reasoning.
Thus, the knowledge engineer and expert must instantiate
architecture-level primitives for a particular application just as
the architecture designer needed to specialize implementation-level
primitives. Figure 8 is a structural view of MUM, a chest pain
specialist, engineered in the MU architecture. Hypotheses are
instantiated as diseases such as classic angina; intermediate
conclusions are instantiated as clusters such as
exercise-induced-pain; inferential relations such as potential
evidence are instantiated by specific links between evidence and
conclusion, such as EKG results and classic angina.
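The layering described in the last two paragraphs can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class names `Frame`, `Hypothesis`, and `Disease`, the `belief` field, and the `add_potential_evidence` method are hypothetical and are not taken from the MU implementation. The sketch shows an implementation-level frame being specialized into an architecture-level hypothesis, which is then instantiated for the chest pain domain.

```python
from dataclasses import dataclass, field


@dataclass
class Frame:
    """Implementation level: a named bag of slots with no task meaning."""
    name: str
    slots: dict = field(default_factory=dict)


@dataclass
class Hypothesis(Frame):
    """Architecture level: a frame specialized for diagnostic reasoning."""
    belief: float = 0.0  # current degree of belief (hypothetical field)

    def add_potential_evidence(self, evidence: Frame) -> None:
        # An inferential relation is a slot whose meaning the
        # architecture fixes in advance.
        self.slots.setdefault("potential-evidence", []).append(evidence)


class Disease(Hypothesis):
    """Application level: hypotheses instantiated for a medical domain."""


# Instantiating architecture-level constructs for one application.
ekg = Frame("EKG results")
angina = Disease("classic angina")
angina.add_potential_evidence(ekg)
```

A disease is still a frame, so implementation-level tools continue to work on it, but the knowledge engineer and expert can talk about it as a hypothesis with potential evidence.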

Once the knowledge engineer has decided to instantiate hypotheses as
diseases, he or she can build a knowledge-acquisition interface to
help elicit knowledge in the terms of the architecture.
Meta-knowledge about the terms is provided by the knowledge engineer
while designing the shell, and is used by the knowledge acquisition
interface to help the user build a syntactically valid and
semantically consistent knowledge base. We currently have
form-filling editors for all objects in Figure 8, a graphics
interface for acquiring some continuous combining functions, and
rudimentary consistency-checking abilities; other tools, especially
for acquiring control knowledge, are in progress.


[Figure 8 layers, top to bottom: Treatments; Diseases (Hypotheses);
Intermediate Conclusions (Clusters); Data Descriptions]

Figure 8: Fragment of the inference network for MUM


C.  Conclusions 

Architecture-level knowledge engineering tools have
several advantages:

•  One can capitalize on vertical integration of
implementation-level tools at the architecture level. For example, a
general-purpose frame editor and network grapher that is provided at
the implementation level (such as the KREME interface (Abrett and
Burstein, 1987)) can be customized as a knowledge acquisition
interface for editing architecture-level constructs such as
hypotheses and diseases. This is possible because the
architecture-level objects are specializations of
implementation-level objects.


•  Software development is facilitated because architecture-level
constructs (the primitive objects of the virtual machine) are
represented declaratively. For example, once the trigger relation has
been designed, one need not worry about several members of a software
project trying to achieve its functionality by different
implementations.

... aims at making every implementation decision explicitly recorded
in a language which allows a program writer to actually generate the
code.

•  Declarative architecture-level constructs also facilitate
knowledge acquisition because meta-knowledge (Davis and Buchanan,
1984) can be attached to objects to check for consistency, provide
help, generate explanations, and so on. For example, a form-filling
interface specialized for acquiring an instance of a disease can use
a declarative description of the properties of diseases, such as the
kinds of relations they have with data, to offer a menu of documented
choices (Gruber and Cohen, 1986).

•  Building a virtual machine at the architecture level and then a
knowledge acquisition interface on top of the virtual machine defines
the roles of the knowledge engineer and expert. The knowledge
engineer designs an architecture by specializing general-purpose
implementation-level tools to operationalize the constructs suited
for the problem solving task, whereas the expert instantiates
architecture-level constructs for the application domain. Virtual
machine tools (shell support) assist the knowledge engineer in
customizing an architecture for a particular application, and
knowledge acquisition tools help the expert build, refine, and debug
the knowledge base.
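The point about declarative constructs and attached meta-knowledge can be made concrete with a small Python sketch. Everything named here is an assumption made for illustration: the slot names in `DISEASE_SPEC`, the object kinds, and the functions `menu_for` and `check` are hypothetical and do not describe the actual MU tools. The sketch shows one declarative description serving both to offer a menu of legal choices and to check a proposed entry for consistency.

```python
# Hypothetical declarative description of one architecture-level construct.
# Slot names and allowed kinds are illustrative, not taken from MU itself.
DISEASE_SPEC = {
    "triggered-by": {"kinds": {"datum", "cluster"},
                     "doc": "Evidence that brings this disease to mind."},
    "treated-by": {"kinds": {"treatment"},
                   "doc": "Treatments that apply to this disease."},
}

# Hypothetical knowledge-base contents: object name -> kind of object.
KNOWLEDGE_BASE = {
    "EKG results": "datum",
    "exercise-induced-pain": "cluster",
    "classic angina": "disease",
}


def menu_for(slot):
    """List the existing objects that may legally fill `slot`."""
    kinds = DISEASE_SPEC[slot]["kinds"]
    return sorted(name for name, kind in KNOWLEDGE_BASE.items() if kind in kinds)


def check(slot, value):
    """Return a list of problems with filling `slot` with `value` (empty if none)."""
    if slot not in DISEASE_SPEC:
        return [f"unknown slot: {slot}"]
    if KNOWLEDGE_BASE.get(value) not in DISEASE_SPEC[slot]["kinds"]:
        return [f"{value!r} is not a legal filler for {slot}"]
    return []
```

Because the description is data rather than code, a form-filling editor, a consistency checker, and a help facility could all read the same specification.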

IV.  Discussion 

The hierarchy of tools discussed here reflects a power/generality
tradeoff. Constructs at the implementation level are general (e.g.,
production systems can be configured for many kinds of problem
solving) but from the standpoint of knowledge engineering they are
weak. To say an object is a disease hypothesis is to imply much more
knowledge about it than to say it is a frame, even though the
implementation of the disease hypothesis may be no more than a frame.
This added knowledge constrains the internal structure of the disease
frame (e.g., values and types of slots, or the kinds of messages it
can handle, etc.), constrains its relationships with other frames,
and so on. Since these constraints facilitate knowledge engineering,
architecture-level objects like disease frames are at the “power” end
of the power/generality spectrum. Implementation-level objects,
lacking constraints, are more general but correspondingly less
powerful from the standpoint of knowledge engineering.

Thus, when one builds an expert system for a task, the utility of an
architecture-level analysis depends entirely on how much one knows
about the task. The knowledge system architecture embodies knowledge
about a class of problem solving tasks (it is a virtual machine for
that class) and as such facilitates system development and knowledge
acquisition for problem solvers of that class. The power/generality
tradeoff tells us that we can ameliorate the knowledge acquisition
bottleneck for restricted classes of tasks by designing architectures
and building integrated “power tools” at the architecture level.


V.  Acquiring  Combining 
Knowledge  —  Modifiable 
Combining  Functions 

This report presents a synthesis of two general approaches to
combining evidence. When designing knowledge systems, knowledge
engineers typically select one approach over the other, but each has
strengths and weaknesses in terms of the ease with which knowledge
can be acquired, represented, interpreted, modified, and explained.
The synthesis we propose, called modifiable combining functions, has
many of the advantages of both approaches and overcomes some of their
disadvantages. The basic idea of modifiable combining functions is to
acquire degrees of belief for a subset of all possible combinations
of evidence, then infer degrees of belief for other combinations in
the set. If, in the course of knowledge engineering, a particular
degree of belief is challenged, then it (and others) can be modified
by an appropriate method.

A combination of evidence is a list of propositions, each with an
associated degree of belief. For example, an expert system for
diagnosing plant diseases has propositions like this:

((soil  texture  heavy,  .7) 

(soil  oxygen  low,  .9)) 

That  is,  soil  texture  is  believed  to  degree  .7  to  be 
heavy  and  soil  oxygen  is  strongly  believed  (.9)  to  be 
low.  Combinations  of  evidence  are  often  found  in  the 
premises  of  inference  rules.  These  rules  can  take  two 
forms, called specified and derived:

specified form:

IF ((soil texture heavy, .7)
    (soil oxygen low, .9))
THEN (water damage yes, .8)


derived form:

IF ((soil texture = heavy, x)
    (soil oxygen = low, y))
THEN (water damage = yes, f(x, y, k))

These forms suggest two general approaches to combining evidence. The
specified form requires that for each combination of degrees of
belief in the premise, a degree of belief is specified for the
conclusion. The derived form requires a function f that derives a
degree of belief in the conclusion for any degrees of belief in the
premise. The constant k in the derived form represents the degree of
belief that would be assigned to the conclusion if the degree of
belief in the premise was 1.0, that is, the degree of belief in the
inference rule itself. This quantity is implicit in the specified
form.
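The contrast between the two forms can be sketched in Python. The specified form becomes a lookup keyed on the exact premise beliefs; the derived form becomes a function. The text does not say what f is, so the choice below, f(x, y, k) = k * min(x, y), is purely an assumption made for illustration, as are the names `SPECIFIED`, `specified`, and `derived`.

```python
# Specified form: one stored conclusion per exact combination of
# premise beliefs. The single entry is the rule quoted in the text.
SPECIFIED = {(0.7, 0.9): 0.8}


def specified(x, y):
    """Return the stored degree of belief, or None if none was specified."""
    return SPECIFIED.get((x, y))


def derived(x, y, k):
    """Derive a degree of belief in the conclusion for any x and y.

    Assumption: f(x, y, k) = k * min(x, y). The source leaves f open;
    this choice only guarantees that f(1.0, 1.0, k) == k, matching the
    stated meaning of k.
    """
    return k * min(x, y)
```

The specified rule answers only the one combination it was given, while the derived rule answers every combination, whether or not an expert would endorse the result.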

These forms combine evidence within inference rules, but they have
counterparts for the case in which two or more rules draw the same,
corroborating conclusion. By analogy with the specified form, degrees
of belief can be acquired for each combination of corroborating
rules; or a general function, analogous to f in the derived form, can
be acquired to calculate degrees of belief for all corroborations.

Both approaches have been used in AI systems. Considering medical
expert systems alone (reflecting our own interest in this area), we
find knowledge in the specified form in PIP (Pauker, et al., 1976),
IRIS (Trigoboff, 1978), MDX (Chandrasekaran, Mittal and Smith, 1982),
and MUM (Cohen, et al., 1986); while MYCIN (Shortliffe, 1976;
Shortliffe and Buchanan, 1975), INTERNIST/CADUCEUS (Pople, 1977) and
ABLE (Patil, Szolovits, and Schwartz, 1976) use knowledge in the
derived form.

In  outline,  we  describe  representations  for  combining 
functions  that  are  closely-related  to  the  specified  and 
derived  forms.  We  will  discuss  the  tradeoffs  between 
these  approaches  that  motivate  the  idea  of  modifiable 
combining  functions.  We  will  illustrate  how  modifi¬ 
able  combining  functions  are  generated  and  modified 
in  the  context  of  an  example.  Parts  of  the  theory 
of modifiable combining functions have been implemented in a medical
expert system (Cohen, et al., 1986), but much of this report should
be taken as research in progress.


A.  Forms  of  combining  functions 

1.  Tabular combining functions

Tabular combining functions are often represented as tables that
specify degrees of belief in conclusions for each combination of
degrees of belief of evidence. Figure 1 shows a tabular function that
combines two pieces of evidence, E1 and E2, for conclusion C. In this
case, degrees of belief in evidence range from -1 to +1, denoting
complete disbelief and belief, respectively. A degree of belief of
zero denotes ignorance; for example, (soil oxygen = low, 0) means
that the value of soil oxygen is unknown, either because it is an
unavailable datum or because the data from which it is inferred are
ambiguous. Many of the cells are blank, meaning that the expert does
not consider these combinations of evidence relevant, that is, does
not expect them to arise during problem solving. From the knowledge
engineer’s perspective, blank cells and zero cells represent
different events. A blank means that that particular combination of
evidence was never considered, but a zero means it was considered and
found to be uninformative. From the perspective of an AI program’s
interpreter, blank and zero may both mean that the combination of
evidence is uninformative; or a blank may be used to alert the user
to incompleteness in the combining function.

[Figure 1: a tabular combining function giving bel(C), indexed by
bel(E1) and bel(E2) at the levels 1.0, .75, .50, .25, 0, -.25, -.50,
-.75, -1.0; individual cell values are not recoverable.]

In tabular combining functions, degrees of belief in evidence index
degrees of belief in conclusions. The combining function in Figure 1
specifies that when the degrees of belief in E1 and E2 are .5 and
.75, respectively, the degree of belief in C is .25. Since
conclusions are often used as evidence for subsequent inferences, the
cells in tabular combining functions contain values that can
themselves be used to index degrees of belief in other tabular
functions. Tabular functions increase exponentially in size: a
function for N pieces of evidence requires an N-dimensional table,
similar to the signature tables invented by Samuel (1959).
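A tabular combining function, with its distinction between blank and zero cells, can be sketched as a sparse Python dictionary. The names `LEVELS`, `TABLE`, `lookup`, and `table_size` are hypothetical. Only the cell for bel(E1) = .5, bel(E2) = .75 is quoted in the text; the zero cell at the center is included as an assumed example of a cell that was considered and found uninformative.

```python
# The nine belief levels used to index each dimension of the table.
LEVELS = [1.0, 0.75, 0.5, 0.25, 0.0, -0.25, -0.5, -0.75, -1.0]

# Sparse table keyed by (bel(E1), bel(E2)). A missing key is a blank
# cell: a combination the expert never considered.
TABLE = {
    (0.5, 0.75): 0.25,  # the one cell quoted in the text
    (0.0, 0.0): 0.0,    # assumed zero cell: considered, uninformative
}


def lookup(bel_e1, bel_e2):
    """Return bel(C), or None when the cell is blank."""
    return TABLE.get((bel_e1, bel_e2))


def table_size(n_evidence, n_levels=len(LEVELS)):
    """Number of cells in a full table for n_evidence pieces of evidence."""
    return n_levels ** n_evidence
```

With nine levels per dimension, two pieces of evidence need 81 cells and three need 729, which is the exponential growth the paragraph describes. Returning `None` for a blank keeps it distinguishable from a stored zero.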


Some important knowledge about patterns or regularities in
combinations of evidence is implicit in tabular combining functions.
For example, the entire upper-right quadrant of Figure 1 is blank,
suggesting that no combination of positive degrees of belief in E1
and negative degrees of belief in E2 is meaningful. Similarly, in the
lower-left quadrant we see a threshold on the degree of belief in E1:
the values in the table are determined by E2 for all values of E1
less than or equal to -.75. These regularities are easily captured by
a rule-based variant of tabular combining functions. The two examples
we just mentioned can be represented this way:


Upper-right quadrant:

IF bel(E1) > 0 and
   bel(E2) < 0
THEN bel(C) := 0

Lower-left quadrant:

IF bel(E1) ≤ -.75 and
   (bel(E2) = .5 or
    bel(E2) = .25)
THEN bel(C) := -.75

IF bel(E1) ≤ -.75 and
   bel(E2) = .75
THEN bel(C) := -.5
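The rules above translate directly into a short Python function. This is a sketch, not the authors' interpreter: the function name `bel_c` is hypothetical, and the threshold is read as "less than or equal to -.75", as the preceding paragraph states. Combinations that no rule covers return `None`, standing in for a blank cell.

```python
def bel_c(bel_e1, bel_e2):
    """Rule-based variant of a tabular combining function (sketch)."""
    # Upper-right quadrant rule.
    if bel_e1 > 0 and bel_e2 < 0:
        return 0.0
    # Lower-left quadrant: once bel(E1) is at or below the threshold,
    # the conclusion depends on bel(E2) alone.
    if bel_e1 <= -0.75 and bel_e2 in (0.5, 0.25):
        return -0.75
    if bel_e1 <= -0.75 and bel_e2 == 0.75:
        return -0.5
    # No rule applies: treat the combination as a blank cell.
    return None
```

Each rule covers a block of cells at once, which is how the rule-based variant makes a regularity such as the threshold on E1 explicit.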

Irrespective of whether the knowledge engineer acquires tables like
Figure 1, or rules as above, he or she must take care to maintain
important distinctions in the domain. For example, the rule for the
upper-right quadrant could be extended to account for the blank cells
in the lower-right quadrant, too, by changing its first clause to “IF
bel(E1) > -.5.” While this rule describes the table, it obscures what
may be an important distinction between positive and negative values
for E1.

Tabular combining functions and their rule-based variant are ways to
represent combinations of evidence given in the specified form,
described above. A representation that relies on both specified and
derived combinations is discussed next.

2.  Interpolated combining functions

Three of the four corner cells of Figure 1 represent degrees of
belief in the conclusion given categorical (certain) data about E1
and E2 (the upper-right cell is blank because nothing is known about
it). They can be arranged in a categorical table as shown below. To
distinguish categorical tables from the larger ones like Figure 1, we
call the latter full tables.

                   bel(E1):
                   1        -1

bel(E2):     1     1        (blank)
            -1     0        -1

The upper-left cell contains the degree of belief in C given that E1
and E2 are both true; conversely, the lower-right cell is the degree
of belief in C when both are false; the 0 in the lower-left cell
represents ignorance in C given that E1 is true and E2 is false. To
reiterate, these are the corner cells of the full table in Figure 1.
All other, noncorner cells in Figure 1 represent interpolations
between the values in this categorical table, interpolations due to
uncertainty in E1 and E2. For example, the cells around the center of
Figure 1 tend toward the value 0, since the center cell represents
the case in which the degrees of belief in E1 and E2 are both zero,
that is, completely uninformative. Similarly, in the lower half of
the table, we see degrees of belief in C ranging from 0 when bel(E2)
= +1, to -1 for lower degrees of belief in E2.

The full table in Figure 1 was built by hand, but full tables can
also be derived by interpolating functions. Figure 2 shows the
derivation of a full table by a Bayesian interpolating function. The
categorical corner cells are 1.0, .95, .25, and 0.0, respectively.
All other cells contain intermediate values that reflect uncertainty
about the evidence. For example, when the degrees of belief in
episode and risk factors are both .75, the degree of belief in the
conclusion is .79, a value intermediate between the four corner
points but nearer in magnitude to 1.0, its nearest neighbor. This
table and its derivation will be explained in Section 5.

To summarize, full tables can be built by hand, by specifying the
value in each cell, or specifying rules that assert the values of
subsets of the cells. Alternatively, they can be derived
automatically by interpolating from categorical tables. Once the
decision has been made to use interpolating functions, full tables
are usually not generated and stored. Instead, the values of
combinations of evidence are computed as needed. However, the
following section suggests that there are advantages to keeping both
forms of combining functions.
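Interpolation from a categorical table can be sketched as follows. The interpolating function itself is not given in this part of the report, so the sketch makes explicit assumptions: degrees of belief are treated as probabilities in [0, 1], the two pieces of evidence are treated as independent, and the interpolated value is the expectation of the corner values (a bilinear interpolation). The assignment of .95 and .25 to particular corners is also assumed, and the names `CORNERS` and `interpolate` are hypothetical.

```python
# Categorical table: (E1 true?, E2 true?) -> degree of belief in C.
# Which corner holds .95 and which holds .25 is an assumption.
CORNERS = {
    (True, True): 1.0,
    (True, False): 0.95,
    (False, True): 0.25,
    (False, False): 0.0,
}


def interpolate(p1, p2, corners=CORNERS):
    """Expected corner value when E1 and E2 hold with probability p1, p2."""
    total = 0.0
    for e1 in (True, False):
        for e2 in (True, False):
            # Weight of this corner under the independence assumption.
            weight = (p1 if e1 else 1.0 - p1) * (p2 if e2 else 1.0 - p2)
            total += weight * corners[(e1, e2)]
    return total


# With both degrees of belief at .75 this evaluates to 0.7875.
value = interpolate(0.75, 0.75)
```

Under these assumptions the value at (.75, .75) is 0.7875, which is consistent with the .79 quoted above, and the function reproduces the corner values exactly when the evidence is certain.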


B.  Comparison 

Our comparison will focus on the tabular and interpolating forms of
combining functions. The strengths of one often correspond to
weaknesses in the other.


First, tabular combining functions do not infer anything that is not
stated by the expert. Most of the cells in a table are blank, meaning
that the expert does not consider them to represent meaningful
combinations of evidence. In theory, every nonblank cell represents a
meaningful combination and every blank cell represents a meaningless
one. But in practice, the sheer size of tabular functions means that
some meaningful combinations of evidence are simply overlooked during
knowledge acquisition. In this sense, tabular functions are brittle:
they cannot account for all meaningful situations that will arise
during problem solving.

Interpolation is clearly a solution to the brittleness problem, since
the value of any blank cell can be inferred from the corners of a
categorical table, or perhaps from its “nearest neighbors.” The
disadvantages of interpolating functions are that, unlike tabular
functions, they produce values for all combinations of evidence in
their domain, meaningful or not. Moreover, no value derived by an
interpolating function is guaranteed to reflect an expert’s judgment.
A subtler problem is that interpolation produces a continuous
gradient of values between the corners of the full table. But an
expert’s degrees of belief in conclusions are unlikely to change
continuously with the degrees of belief in the evidence. Thresholds
are common, as illustrated by the rule-based variant of tabular
functions.

Tabular functions are locally modifiable, meaning that a knowledge
engineer can change the values of individual cells in the table with
the assurance that the performance of the system will remain
unchanged except in the cases of these particular combinations of
evidence. This allows a combining function to be “tuned” in the
normal course of knowledge base refinement: when the system presents
a conclusion that the expert thinks is wrong, and the source of the
error is localized to a particular cell, then that cell can be
changed. In contrast, changing an interpolating function necessarily
affects the values assigned to all combinations of evidence in its
domain. Modifying an interpolating function is essentially
redesigning one’s inference system (Gruber and Cohen, 1987).

C.  Modifiable  Combining  Functions 

Once the knowledge engineer considers using interpolating functions,
why bother to acquire full tables by hand? Why not simply acquire
categorical tables, as above, and design interpolating functions to,
in effect, “fill in” the intermediate values? Clearly, the two
approaches are equivalent if the interpolating functions generate the
same values as the expert for any combination of evidence. But there
is no way to test this, other than to acquire an entire table and
then compare it with the results of an interpolation function.
Consequently, the knowledge engineer can take one of two positions
with respect to potential differences between interpolated values and
the expert’s judgment:

•  The  knowledge  engineer  can  design  a  function 
that  has  desirable  properties  and  assume  that, 
if  the  expert’s  judgment  is  different,  it  is  because 
the  expert’s  reasoning  is  inconsistent  or  otherwise 
flawed. 

•  The knowledge engineer can design a function that is assumed to
reflect expert judgment, but modify it to conform to the expert when
deviations become apparent.

The  first  position  is  associated  with  normative  models, 
the  second  with  performance  models.  In  both  cases, 
the  knowledge  engineer  must  carefully  design  interpo¬ 
lation  functions  given  what  he  knows  and  can  assume 
about  the  evidence  in  a  domain.  In  the  latter  case,  in 
addition,  he  must  have  some  mechanism  for  modifying 
combining  functions. 

Modifiable  combining  functions  are  a  synthesis  of  tab¬ 
ular  and  interpolating  functions.  They  are  tabular 
functions  that  have  most  of  their  values  derived  by 
interpolation,  but  that  can  be  modified  to  conform  to 
an  expert’s  judgment.  Knowledge  engineers  must  first 
acquire  a  categorical  table  and  any  other  cells  in  the 
full  table  that  the  expert  can  provide.  Interpolating 
functions  ideally  should  fill  in  cells  that  the  expert  and 
knowledge  engineer  neglected  to  specify,  with  values 
that  are  likely  to  match  the  expert’s  judgment,  but 
not  fill  in  cells  they  intended  to  leave  blank.  If  these 
goals  are  not  achieved,  the  tabular  function  can  be 
modified  by  one  of  the  three  mechanisms  discussed 
below. 


D.  An  Example 

This section illustrates modifiable combining functions for two
pieces of evidence from a medical diagnosis problem. Most diagnosis
begins with the physician taking a history: asking about the patient’s
chief complaint, age, past medical history, and so on. Our example
concerns the diagnosis of angina and two pieces of evidence from the
history: the patient’s report of an episode of chest pain, and whether
the patient has risk factors for angina. Clearly, other evidence plays
a role in diagnosis, but we will focus on a single rule that infers
that the patient’s history is consistent with angina if he or she has
a characteristic episode and risk factors:

episode & risk factors → angina history

Both pieces of evidence can be uncertain because each depends on
several observations. For the purpose of this example, assume that
degrees of belief in episode and risk factors are subjective
probabilities ranging from 0.0 to 1.0. The interpretation of
P(episode) = 0 is “the episode is not characteristic of angina.” An
intermediate degree of belief, say P(episode) = .5, means “some
aspects of the episode are consistent with angina, but other aspects
are missing.” The following examples illustrate assessments of degrees
of belief for particular observations:

•  crushing chest pain, induced by exercise, lasting a few minutes,
radiating to one or both arms, accompanied by sweating and shortness
of breath: P(episode) = 1.0

•  sharp, fleeting chest pain, induced by sudden movement, not
radiating: P(episode) = 0.0

•  diffuse chest pain, came on after eating, radiating, lasting about
30 seconds: P(episode) = 0.5

•  60 year-old male, overweight, smoker, with high blood pressure, and
two brothers with coronary artery disease: P(risk factors) = 1.0

•  30 year-old female, nonsmoker, not overweight, normal blood
pressure, no history of heart disease in the family:
P(risk factors) = 0.0

•  45 year-old male, smoker, not overweight, marginally-high blood
pressure, uncle had coronary at age 60: P(risk factors) = 0.5

Given that episode and risk factors can be uncertain, how should a
knowledge engineer acquire knowledge about the combinations of this
evidence that support (or detract from) the conclusion? Degrees of
belief for all possible combinations could be acquired in the
specified form, and arranged in a tabular combining function.
Alternatively, the knowledge engineer might design a combining
function, f, and derive the degrees of belief of combinations by
interpolation.

Modifiable  combining  functions  present  an  intermedi¬ 
ate  alternative:  the  knowledge  engineer  acquires  some 
degrees  of  belief  for  a  subset  of  the  possible  combina¬ 
tions,  then  designs  a  function  to  interpolate  the  val¬ 
ues  of  the  rest  and  arranges  the  results  in  a  table, 
then  modifies  the  table  if  necessary  to  accord  with 
the  expert’s  judgment.  An  obvious  place  to  begin  this 
process  is  with  the  categorical  table,  from  which  a 
full  table  can  be  interpolated.  Imagine  the  following 
rules,  qualified  by  degrees  of  belief,  are  acquired  from 
the  expert: 

episode & risk factors → angina history, 1.0
episode & ~risk factors → angina history, .95
~episode & risk factors → angina history, .25
~episode & ~risk factors → angina history, 0.0

These can be arranged in the following categorical table:

                          P(episode)
                          1        0
P(risk factors)    1      1.0      .25
                   0      .95      0.0

The knowledge engineer needs to design a function from which
P(angina history) can be derived for values of P(episode) and
P(risk factors) other than 1 and 0. Such functions reflect the
knowledge engineer’s assumptions about the domain. We will illustrate
a Bayesian function designed under the assumption that P(episode) and
P(risk factors) are independent.

The Bayesian interpolation function is derived from the rule of total
probability, which says

P(A) = \sum_{i=1}^{n} P(A \mid B_i) P(B_i)

where B_1, ..., B_n is an exhaustive list of mutually exclusive
possibilities. For our example, A is the conclusion angina history
and B_1, ..., B_n is

episode & risk factors
episode & ~risk factors
~episode & risk factors
~episode & ~risk factors

Then, P(angina history) can be derived for any degrees of belief in
episode and risk factors as follows:

P(a) =
      P(a | e & r) P(e & r)
    + P(a | e & ~r) P(e & ~r)
    + P(a | ~e & r) P(~e & r)
    + P(a | ~e & ~r) P(~e & ~r)

where episode, risk factors and angina history are abbreviated e, r,
and a, respectively.

The values of the conditional terms in this expression have already
been acquired from the expert and are recorded in the categorical
table (e.g., P(a | e & r) = 1.0, P(a | e & ~r) = .95, ...). The
knowledge engineer now must decide whether to acquire the other terms
in the expression: P(e & r), P(e & ~r), ... This effort can be avoided
by assuming that e and r are independent, in which case
P(e & r) = P(e)P(r), and

P(a) =
      P(a | e & r) P(e) P(r)
    + P(a | e & ~r) P(e) P(~r)
    + P(a | ~e & r) P(~e) P(r)
    + P(a | ~e & ~r) P(~e) P(~r)

or,

P(a) =                                                    (1)
      P(a | e & r) P(e) P(r)
    + P(a | e & ~r) P(e) [1 - P(r)]
    + P(a | ~e & r) [1 - P(e)] P(r)
    + P(a | ~e & ~r) [1 - P(e)] [1 - P(r)].
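As a minimal sketch of equation (1), assuming degrees of belief are held as ordinary floating-point probabilities, the interpolation can be written as a short Python function. The names `CATEGORICAL` and `interpolate` are illustrative only and are not taken from the system described here.

```python
# Categorical values acquired from the expert,
# keyed by (episode present?, risk factors present?).
CATEGORICAL = {
    (True, True): 1.0,     # P(a | e & r)
    (True, False): 0.95,   # P(a | e & ~r)
    (False, True): 0.25,   # P(a | ~e & r)
    (False, False): 0.0,   # P(a | ~e & ~r)
}


def interpolate(p_e, p_r, categorical=CATEGORICAL):
    """Equation (1): P(a), assuming e and r are independent."""
    total = 0.0
    for (e, r), p_a_given in categorical.items():
        weight_e = p_e if e else 1.0 - p_e   # P(e) or 1 - P(e)
        weight_r = p_r if r else 1.0 - p_r   # P(r) or 1 - P(r)
        total += p_a_given * weight_e * weight_r
    return total


if __name__ == "__main__":
    print(interpolate(1.0, 0.0))    # a corner of the categorical table
    print(interpolate(0.5, 0.75))   # an intermediate combination
```

At the four corners the function returns the categorical values themselves, which is the sense in which it "fills in" the table between them.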


Figure 2 illustrates a full table containing the values of P(a)
derived by this function from these categorical values

P(a | e & r) = 1.0
P(a | e & ~r) = .95
P(a | ~e & r) = .25
P(a | ~e & ~r) = 0

and letting P(e) and P(r) range through the values 0, .125, .25,
.375, .5, .625, .75, .875, and 1.0.


                                   P(episode)
P(risk factors)   1.0  .875   .75  .625   .50  .375   .25  .125     0
      1.0         1.0   .91   .81   .72   .63   .53   .44   .34   .25
      .875        .99   .90   .80   .70   .61   .51   .41   .32   .22
      .75         .99   .89   .79   .69   .59   .49   .39   .29   .19
      .625        .98   .88   .78   .67   .57   .47   .36   .26   .16
      .50         .98   .87   .76   .66   .55   .44   .34   .23   .13
      .375        .97   .86   .75   .64   .53   .42   .31   .20   .09
      .25         .96   .85   .74   .63   .51   .40   .29   .18   .06
      .125        .96   .84   .73   .61   .49   .38   .26   .15   .03
      0           .95   .83   .71   .59   .48   .36   .24   .12     0

Figure 2
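The full table of Figure 2 can be regenerated from the four categorical values. The sketch below is one way to do so, assuming exact rational arithmetic and half-up rounding to two decimal places (Python's built-in `round` rounds halves to even, so it is avoided here); the function names are hypothetical.

```python
from fractions import Fraction

# Grid values 1.0, .875, ..., 0 in the order used by Figure 2.
STEPS = [Fraction(k, 8) for k in range(8, -1, -1)]

CATEGORICAL = {
    (True, True): Fraction(1),         # P(a | e & r)
    (True, False): Fraction(95, 100),  # P(a | e & ~r)
    (False, True): Fraction(25, 100),  # P(a | ~e & r)
    (False, False): Fraction(0),       # P(a | ~e & ~r)
}


def interpolate(p_e, p_r):
    """Equation (1), evaluated exactly with rational arithmetic."""
    return sum(
        value * (p_e if e else 1 - p_e) * (p_r if r else 1 - p_r)
        for (e, r), value in CATEGORICAL.items()
    )


def hundredths(x):
    """Round a non-negative Fraction to hundredths, halves rounding up."""
    return int(x * 100 + Fraction(1, 2))


def full_table():
    """Rows are indexed by P(risk factors), columns by P(episode)."""
    return [[hundredths(interpolate(p_e, p_r)) for p_e in STEPS]
            for p_r in STEPS]


if __name__ == "__main__":
    for p_r, row in zip(STEPS, full_table()):
        print(f"{float(p_r):5.3f}", " ".join(f"{cell:3d}" for cell in row))
```

Each cell is reported as an integer number of hundredths, so 59 stands for .59.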

The Bayesian function (1) is an example of what is sometimes called
Jeffrey’s rule (Shafer, 1981; Shafer and Tversky, 1985). In such a
design the conditional probabilities P(a | e & r), ... reflect the
expert’s heuristic judgments based on previous cases of angina. In
contrast, the unconditional probabilities P(e & r) = P(e)P(r), ...
reflect knowledge about the individual patient who is currently being
diagnosed. This is because, to calculate P(e & r), we assume that the
probability of an angina episode is independent of whether one is at
risk. This is true for an individual patient: for this patient the
probability of an angina episode is independent of the probability
that he is at risk. This patient either has risk factors in addition
to his angina episode or he doesn’t. Thus, the decision to design a
Bayesian function for which P(e & r) = P(e)P(r) implies that the
expert’s knowledge of patients in general is dominated by his
knowledge of the probabilities P(e) and P(r) for the individual
patient.

Consider how this assumption might lead to a conflict with the
expert’s judgment. In general,

P(e & r) = P(e | r) P(r)

or, if P(e) and P(r) are independent, then P(e | r) = P(e) and

P(e & r) = P(e) P(r)

But this seems wrong because, in general, the probability of an
episode given risk factors is higher than the probability of the
episode, or

P(e | r) P(r) > P(e) P(r)

Consequently, in some cases, P(e & r) will be too low, and so the
value of P(a) denoted by (1) will be too low, as well. For example,
according to Figure 2, if P(episode) = .5 and P(risk factors) = .75,
then P(angina history) = .59. But in the course of testing a system,
the expert may challenge this result. He may say that if there is
moderate evidence of an episode and strong evidence of risk factors
then the probability of angina history should be much higher, say,
0.75.
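The interpolated value in this example can be checked by substituting P(e) = .5 and P(r) = .75, together with the four categorical values, into equation (1). The worked arithmetic is:

```latex
\begin{align*}
P(a) &= (1.0)(.5)(.75) + (.95)(.5)(.25) + (.25)(.5)(.75) + (0)(.5)(.25) \\
     &= .375 + .11875 + .09375 + 0 \\
     &= .5875 \approx .59
\end{align*}
```

The expert's suggested value of 0.75 thus exceeds the interpolated value by about 0.16.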

What  should  the  knowledge  engineer  do  in  this  case? 
If  he  is  relying  exclusively  on  interpolating  functions 
then  he  has  3  options: 

1.  insist  that  the  expert’s  judgment  is  flawed 

2.  change  the  categorical  table 

3.  change  the  interpolating  function 

The first is practical only if the knowledge engineer is confident
that the assumptions that underlie his interpolating function are
reasonable. The other two have global effects on all the numbers in
the table, not just the few the expert criticized. Thus, in fixing
the immediate problem the knowledge engineer could introduce new ones.
Knowledge engineering often extends over a period of months, and the
knowledge engineer relies on a kind of monotonicity: the idea that
adding new knowledge to a system will not make it perform differently
on the majority of previous cases. Changing the categorical table has
ramifications only for the inference rule with which it is associated,
but changing an interpolation function will change the degrees of
belief of all the conclusions derived by that function, potentially
every conclusion previously derived by a knowledge system.

If the knowledge engineer does decide to change the function, how
should he go about it? We could change (1) by eliminating the
independence assumption and acquiring the required conditional
probabilities from the expert. Or, we might design a completely new
Bayesian function that exploits the causal associations between the
evidence and the conclusion (Pearl, 1986). Or, we could conclude that
a belief-function design better characterizes the relationship between
the evidence and the conclusion (Shenoy and Shafer, 1986).⁷ Many
interpolation schemes are possible, but most of them are
mathematically complicated, or computationally expensive, or require
many more numbers than the expert can accurately provide. The Bayesian
function above (1) is very simple and requires few numbers. Its major
deficit is that, in a few cases, it produces numbers with which the
expert disagrees.

⁷ The authors are currently working on a formulation of modifiable
combining functions based on belief functions. The formulation is
preliminary, and space limitations preclude introducing it here.
                                   P(episode)
P(risk factors)   1.0  .875   .75  .625   .50  .375   .25  .125     0
      1.0         1.0   .91   .81   .72   .63   .53   .44   .34   .25
      .875        .99   .90   .80   .70   .61   .51   .41   .32   .22
      .75         .99   .89   .79   .69   .59   .49   .39   .29   .19
      .625        .98   .88   .78   .67   .57   .47   .36   .26   .16
      .50         .98   .87   .76   .66   .55   .44   .34   .23   .13
      .375        .97   .86   .75   .64   .53   .42   .31   .20   .09
      .25         .96   .85   .74   .63   .51   .40   .29   .18   .06
      .125        .96   .84   .73   .61   .49   .38   .26   .15   .03
      0           .95   .83   .71   .59   .48   .36   .24   .12     0

Figure 3

If the knowledge engineer does not rely exclusively on interpolating
functions to calculate degrees of belief, then he has another option
besides the three listed above: he can simply change the values that
the expert says are wrong and store the new values in a tabular form
that overrides the derived values. The idea of modifiable combining
functions is, in essence, to use simple interpolating functions to
derive full tables from categorical tables, then, when the expert
criticizes a derived degree of belief, to simply change it. This is
shown in Figures 3 and 4. In Figure 3, the expert identifies a block
of cells with values that are too low, for the reasons we discussed
earlier. Figure 4 shows one possible modification.
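The override mechanism just described can be sketched as a derived table with a separate store of expert-supplied cells. This is only an illustration under assumed names (`ModifiableCombiningFunction`, `override`, `lookup`); the text does not specify how the tabular form is implemented. The override value 0.75 is the expert's figure from the example above.

```python
class ModifiableCombiningFunction:
    """A derived table whose individual cells can be overridden."""

    def __init__(self, interpolate):
        self.interpolate = interpolate  # any function of (p_e, p_r)
        self.overrides = {}             # (p_e, p_r) -> value set by expert

    def override(self, p_e, p_r, value):
        """Record the expert's value for one cell; others are untouched."""
        self.overrides[(p_e, p_r)] = value

    def lookup(self, p_e, p_r):
        """Return the override if one exists, else the derived value."""
        if (p_e, p_r) in self.overrides:
            return self.overrides[(p_e, p_r)]
        return self.interpolate(p_e, p_r)


def bayes(p_e, p_r):
    """Equation (1) with the categorical values 1.0, .95, .25 and 0."""
    return (1.0 * p_e * p_r
            + 0.95 * p_e * (1 - p_r)
            + 0.25 * (1 - p_e) * p_r
            + 0.0 * (1 - p_e) * (1 - p_r))


if __name__ == "__main__":
    f = ModifiableCombiningFunction(bayes)
    f.override(0.5, 0.75, 0.75)    # the expert's value for this cell
    print(f.lookup(0.5, 0.75))     # overridden cell
    print(f.lookup(0.5, 0.625))    # neighbouring cell, still derived
```

Because the grid values are multiples of 1/8, they are exactly representable as floats and can safely serve as dictionary keys in this sketch.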

In sum, modifiable combining functions offer three methods for
representing expert judgments about combinations of evidence. First,
individual cells, or blocks of cells, in a derived tabular function
can be changed. Second, the value in the categorical table can be
changed. Third, and as a last resort, the interpolating function can
be redesigned.

Figure 4


E.  Conclusion 

Modifiable combining functions share many of the advantages of
tabular and interpolating functions while avoiding some of their
disadvantages. The information burden of tabular functions is reduced
because the full table is derived by interpolating from the values
the expert can provide. (One natural basis for the interpolation is
the categorical table, but others are possible.) The brittleness of
tabular combining functions, especially multidimensional ones, is
overcome. Simple interpolating functions can be used, requiring
relatively few numbers from the expert. Then, any values in the
derived full table can be overridden by the expert’s judgment.
Discontinuities can easily be expressed in the rule-based variant of
tabular combining functions. When an interpolating function fills in
cells that the expert thinks should be blank (meaningless), the
function can be modified accordingly. All modifications to cells are
local in the sense that they affect the system’s performance for
combinations of evidence represented by those cells only. But if
global modifications are appropriate, if all the values in a
modifiable combining function seem wrong to the expert, then the
knowledge engineer can first consider modifying the categorical table
(or any other set of points used for interpolation) and then consider
modifying the interpolation function.

Currently,  we  are  acquiring  tabular  combining  func¬ 
tions  for  a  medical  expert  system  (Cohen  et  al.,  1987) 
and  a  plant  pathology  system.  They  are  represented 
as  rules,  as  discussed  above.  We  have  built  interfaces 
for  acquiring  and  modifying  these  rules,  and  we  have 
almost  completed  a  graphic  interface  for  representing 
them  in  tabular  form. 


VI.  A  Notation  for 
Representing  Strategies 

Our  current  approach  to  encoding  strategies  in  MU 
planners  has  been  to  divide  the  decision  space  into 
phases  whose  applicability  is  decided  by  a  test  on  the 
state  of  the  network.  Each  phase  then  provides  a 
means  for  deciding  the  current  focus.  Potential  ev¬ 
idence  for  the  hypotheses  is  obtained  and  each  phase 
may  further  specify  filtering  criteria  to  form  a  subset 
of these actions to actually undertake. At this point, though, we
don’t do anything very intelligent. We have either considered these
remaining actions as a “plan” and just executed them, as in the latest
version, or undertaken one and seen what changes occurred, cycling
through the phases again to see which one is now most appropriate.

Looking at some of the strategies/tactics that we have obtained from
the medical domain, it appears that a richer representation at this
action level would allow us to implement more sophisticated plans. In
discussing possible orderings of actions we found that many of the
operations involved forming and manipulating sets. Further, we
recognized the existence of these sets by feature predicates on the
possible actions. It became helpful to have a notation to express
these sets and their orderings, and this is what we are currently
working on.

Consider that we have already selected a focus set of hypotheses,
have obtained actions relevant to them, and have done some type of
filtering on those actions. We have a plan which may be applicable at
this point, which says: “if there are two actions we can take w.r.t. a
hypothesis, and one has high cost and high diagnosticity, while the
other has low cost and low diagnosticity, then perform the action with
low cost first and if it has a positive effect do the higher cost
one.” We need to compare our actions on two different features:

Cost and Diagnosticity

We know MU will allow us to define these and other features so we can
write a general expression to denote this situation:

∃ X, Y, H [ F1(X, Value, H) ∧
            F2(X, Value, H) ∧
            F1(Y, Value, H) ∧
            F2(Y, Value, H) ]

where F1 and F2 are features and X and Y are actions, with Value being
the current value that action has for that particular feature, and H
the hypothesis under consideration.
If actions A and B are the variables that we place into an
instantiation of the above form, say

Cost(A, High, Hypo_1) &
Diagnosticity(A, High, Hypo_1) &
Cost(B, Low, Hypo_1) &
Diagnosticity(B, Low, Hypo_1)

then we want to say something like {B : A, C}, where “:” is a shuffle
operator and means:

Do A IF Result(B) satisfies the condition C

and

C(B) = |Δ belief(H)| > 1 &
       belief(H) > Precond(A)⁸

That is, do B, then do A only if the result of B is a change in our
belief of Hypo_1. This will require that we note Hypo_1’s belief level
before taking action B and after, but it does not mean that we must
exit this phase and perform other computation. What we are trying to
do is order our applicable actions by the dependencies that may exist
between them.
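A minimal sketch of this dependency, assuming belief is available as a number that can be read before and after an action, might look as follows. The parameter `min_change` stands in for the threshold on the change in belief, whose scale the text does not state; `run_dependency` and its callback arguments are hypothetical names.

```python
def condition_met(belief_before, belief_after, precond_a, min_change):
    """C(B): belief changed enough and now exceeds A's precondition."""
    changed = abs(belief_after - belief_before) > min_change
    return changed and belief_after > precond_a


def run_dependency(do_b, do_a, get_belief, precond_a, min_change):
    """Do B; do A only if the result of B satisfies the condition C."""
    before = get_belief()   # note the belief level before taking action B
    do_b()
    after = get_belief()    # ... and after
    if condition_met(before, after, precond_a, min_change):
        do_a()
        return ["B", "A"]
    return ["B"]


if __name__ == "__main__":
    state = {"belief": 0.2}   # hypothetical belief in Hypo_1

    def cheap_test():
        state["belief"] = 0.6

    def expensive_test():
        state["belief"] = 0.9

    print(run_dependency(cheap_test, expensive_test,
                         lambda: state["belief"],
                         precond_a=0.5, min_change=0.1))
```

The whole exchange happens inside one call, which mirrors the point that the phase need not be exited between B and A.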

We are ordering by features (whose value we can easily access) and
since our actions are taken to modify other features (such as degree
of belief) we can also express and verify the conditions in the
dependencies. We might imagine a set of these dependencies:

{{A B} {C D E} {F G}}

where we could choose to carry out either

•  One of the sets

•  All of the sets, in order or randomly

•  Dependencies similar to those inside the sets

However, inside the sets themselves it would appear that the only
meaningful operation would be some form of dependencies. For example,
we have the ability right now in MU to order our final set of actions
by feature(s). But if we are going to perform them all anyway, what is
the point? And re-evaluating our phase state isn’t always called for.
What we want is to group our actions into meaningful steps which
indeed do form a plan.

Graphically, we have something like this:

Figure 9: Ordering of action sets


In the strategy mentioned earlier, we might first want to see if
there are any actions that have high diagnosticity and low cost and
ask those first. If they do not affect our belief in the hypothesis
then we will check to see if the example situation is present. We may
also have the case where none of our actions differ on some feature.
We may choose to order them by another. For instance, if all our
actions have the same costs we may decide to order them by potential
diagnosticity. Then if one of them succeeds in achieving its maximum
potential we can stop because we know none of the others can do better
(this does ignore the fact that the further actions might give us
conflicting negative information).

We have sets of possible actions, each set containing some
dependencies internal to it, and possible orderings existing between
the sets. We can denote these, borrowing a bit from work by P. Bates
on event descriptions, by

# - Shuffle, do all the sets, order is not important
’ - Inorder, do all the sets, in the order specified
! - Do one, any one of the sets - presumably if you only want one in
    particular you know what it is and won’t have a set
: - Dependency, do one set, if a condition is met/not met do another
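To make the four operators concrete, the following sketch interprets nested plans built from them. Several choices here are assumptions rather than part of the notation: shuffle uses the listed order (one of the admissible orders), do-one picks the first member, dependency stops as soon as the condition is met, and an ASCII apostrophe stands in for the in-order symbol. The `perform` callback is hypothetical; it carries out one action and reports whether the condition of interest (for example, an increase in belief) was met.

```python
from dataclasses import dataclass
from typing import Callable, List, Union


@dataclass
class Action:
    name: str


@dataclass
class Group:
    op: str                 # "#", "'", "!" or ":"
    items: List["Plan"]


Plan = Union[Action, Group]


def run(plan: Plan, perform: Callable[[str], bool]) -> bool:
    """Execute a plan; return True if the condition was met along the way."""
    if isinstance(plan, Action):
        return perform(plan.name)
    if plan.op in ("#", "'"):
        # Shuffle / in-order: do every member (listed order used for both).
        results = [run(item, perform) for item in plan.items]
        return any(results)
    if plan.op == "!":
        # Do one: this sketch simply takes the first member.
        return run(plan.items[0], perform)
    if plan.op == ":":
        # Dependency: try members in turn until the condition is met.
        for item in plan.items:
            if run(item, perform):
                return True
        return False
    raise ValueError(f"unknown operator: {plan.op!r}")
```

A plan such as {: {: C D E} {: F G}} is then `Group(":", [Group(":", [Action("C"), Action("D"), Action("E")]), Group(":", [Action("F"), Action("G")])])`.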


Figure  10:  Dependencies  of  actions 

The dependencies in a set are shown by the arrow. Again, we may not
have dependencies between the actions we decide to undertake, but when
this is the case we will not have this same grouping of sets. Our
dependencies are of the form:

Do A; if the result of A is such and such then either:
    Do B
    Do either B or C
    Do B then C

So a possible phase could have the following instructions once a
focus has been found and some preliminary filtering of actions has
been done.

1.  If all actions are equal on some feature like cost, they could be
ordered by another feature, diagnosticity, and performed with
dependencies or, in the usual case where we are trying to increase our
belief state, we could go on until our state was equal to or greater
than the potential state achievable by the remaining actions
themselves.

2.  If some actions are the best we could hope for - Cost(Z, low) and
Diagnosticity(Z, high) - then we will perform these actions. If this
isn’t the case, or if after this we still have actions whose potential
is greater than our current state of belief, then we can go on to -

3.  Identify the situations mentioned earlier for which we have a
tactic for organizing a plan.

Suppose we have three sets of actions -
If these are formed of a pair of actions whose dependencies are
determined as in the cost, diagnosticity example, we can express
several plans -

{ :{set A} {set B} } - Do set A, if condition (in this case increase
    in belief) not met then do B
    Gives AB or AA’ or AA’B’

{ #{set A} {set B} } - Do both sets but order unimportant
    Gives AA’ or AA’B’ or ABA’ or ABA’B’ or with primes first

{ ’{set A} {set B} } - Do set A then set B
    Gives the first case of example above

{ !{set A} {set B} } - Do set A or set B
    Gives A, AB or A’, A’B’

And of course....

{ : {set A} {!{set B} {set C}} } - Do set A and if no change do set B
    or set C
{ : {set A} {{set B} : {set C}} } - Do A, if no change do B, if no
    change do C

Most  likely  if  we  are  concerned  with  sets  for  one  hypo 
we  are  looking  at  - 

{:  {!  A  A’  A”}  {!  B  B’  B”}  }  -  Do  one  of  the 
first  set,  if  condition  met  do  one  of  the  second 
set 
or 

{:  {:  A  A’  A”}  {!  B  B’  B”}}  -  Do  A,  if  condition 
not  met  do  A’...  then  do  one  from  second  set 

In fact you can come up with a lot of improbable combinations;
however, you can also realize some rather practical ones you hadn’t
considered before. The motivation is to be able to express the
situations for which strategies we have obtained (or made up) apply.
We do this by writing general forms with variables allowed to
specify -

•  features - like cost, degree of belief

•  the applicable actions - ekg, stress test

•  the hypothesis under consideration to which the actions apply

Again, for example, consider forming a set of actions with two
criteria -

∀X [ F1(X, Value, Hypo_1) &
     F2(X, Value, Hypo_1) ]

and we use time for F1 and short for its value and cost for F2 and
medium or less for its value. We form a set of actions which apply -
{C D E}, and order them - {: C D E}. That is, do C, if condition not
met do D... Now we make another set by testing the remaining actions
with the same pattern as above but with no cost considerations. This
becomes {: F G}. And now we order the two sets {: {: C D E} {: F G}}.
Our strategy says we are interested in actions which have a small time
cost associated with them but we would still like to consider other
costs initially. If these initial actions are insufficient then we are
willing to incur any of these other costs for the sake of short time.

There are some possibly large computation costs here. We have several
hypos under consideration at one time and many actions can affect more
than one hypo. In the medical domain this doesn’t appear to be too bad
a problem in reality. The numbers are manageable. We can deal with one
hypo at a time when constructing our trees and we can identify the
values of the features in question in a straightforward manner.
Whether the small numbers are true in other domains (where similar
groupings of actions occur as tactics) remains to be seen.
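The two-stage selection described above can be sketched as two feature filters followed by nesting. Everything concrete below is hypothetical: the action names, their feature values, the three-level cost scale, and the tuple encoding of plans are assumptions chosen only so that the result matches the {: {: C D E} {: F G}} form in the text.

```python
LEVELS = {"low": 0, "medium": 1, "high": 2}

# Hypothetical actions and feature values for one hypothesis.
ACTIONS = {
    "C": {"time": "short", "cost": "low"},
    "D": {"time": "short", "cost": "medium"},
    "E": {"time": "short", "cost": "low"},
    "F": {"time": "short", "cost": "high"},
    "G": {"time": "short", "cost": "high"},
    "K": {"time": "long", "cost": "low"},
}


def select(actions, time, max_cost=None):
    """Names of actions with the given time value and, optionally,
    a cost no greater than max_cost."""
    chosen = []
    for name, features in actions.items():
        if features["time"] != time:
            continue
        if max_cost is not None and LEVELS[features["cost"]] > LEVELS[max_cost]:
            continue
        chosen.append(name)
    return chosen


def build_plan(actions):
    """Form {: {: first} {: second}} as nested tuples tagged with ':'."""
    first = select(actions, "short", max_cost="medium")
    remaining = {n: f for n, f in actions.items() if n not in first}
    second = select(remaining, "short")   # same pattern, no cost test
    return (":", (":", *first), (":", *second))


if __name__ == "__main__":
    print(build_plan(ACTIONS))
```

The long-duration action K is never selected, which reflects the strategy's preference for short time regardless of other costs.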

We can however have the situation

Hypo_1      Hypo_2
  |           |
{set A}     {set B}
 :E F        :F G

where we do action F for Hypo_2 after deciding not to do it for Hypo_1
because the result of E did not meet the specified condition -
explanation is important.

A further consideration of this approach is to use a similar
representation to obtain the sets of hypos which form the focus also.


VII.  Evaluation  of  the 
GRANT  System 

GRANT’s evolution from a small, prototype system (Cohen et al., 1985)
to the present has given us the opportunity to compare performance as
the system has been scaled up, and to consider the potentials and
pitfalls of developing other GRANT-like systems.

This  section  discusses  a  battery  of  tests  on  the  current 
system. 

The primary measures of GRANT’s performance are recall and fallout
rate. (A third statistic, precision, is 1.0 - fallout.) Recall is the
percentage of all the agencies accepted by the expert that GRANT
found, and fallout is the percentage of all the agencies found by
GRANT that were judged good by GRANT but bad by the expert:

recall rate =
    (num. of agencies judged good by GRANT, good by expert)
    / (num. of agencies judged good by expert)

fallout =
    (num. of agencies judged good by GRANT, bad by expert)
    / (num. of agencies judged good by GRANT)
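The two definitions translate directly into set operations. The sketch below assumes the agencies found by GRANT and those accepted by the expert are available as non-empty sets of identifiers; the function name and the toy data are illustrative.

```python
def recall_and_fallout(found_by_grant, good_by_expert):
    """Return (recall, fallout) for two non-empty sets of agency ids."""
    hits = found_by_grant & good_by_expert             # good by both
    false_positives = found_by_grant - good_by_expert  # good by GRANT only
    recall = len(hits) / len(good_by_expert)
    fallout = len(false_positives) / len(found_by_grant)
    return recall, fallout


if __name__ == "__main__":
    found = {"a", "b", "c", "d"}   # toy data, not from the experiments
    good = {"a", "b", "e"}
    recall, fallout = recall_and_fallout(found, good)
    print(f"recall={recall:.2f} fallout={fallout:.2f} "
          f"precision={1.0 - fallout:.2f}")
```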

To calculate recall and fallout for a proposal, we need to generate a
list of agencies from which the expert can select the ones that are
likely to fund the proposal. One method would be to have the expert
rank all 700 agencies in the network for each proposal, but this would
be exhausting. Instead, GRANT is run in a minimally-constrained,
spreading activation search that reports all agencies found within a
given “distance” from each research topic in the proposal. This is
called breadth-first (BF) search.⁹ For each proposal, we first run a
BF search then ask our expert to classify the agencies it finds as
good or bad. Since the search is blind, many of the agencies are bad;
that is, unlikely in the expert’s judgment to fund the proposal.

Then we run GRANT in an endorsement-constrained mode called EC
search, avoiding negatively-endorsed pathways and favoring
positively-endorsed ones. It finds a subset of the agencies discovered
by BF search. Ideally, it should find all and only the agencies ranked
as good by the expert, but in practice it fails to find some of the
good agencies (called misses) and finds some bad ones (called false
positives). GRANT’s miss rate tends to be very low, so we will be
concerned primarily with the relationship between the fallout rate and
recall rate.

[9] Completely unconstrained BF search finds all agencies in the
network, each by dozens of different paths, and requires hours of CPU
time on a TI Explorer Lisp Machine. The data presented here are
for a modified version of BF search that avoids nodes with extremely
high fan-out and prunes paths longer than 4 links.
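
The modified BF search described in the footnote can be sketched as a depth-limited breadth-first traversal. This is only an illustration, not GRANT's code: the graph layout (a dict of labelled links), the `is_agency` predicate, and the `max_fan_out` threshold are all assumptions, since the text gives only the 4-link limit. Unlike the search described above, this sketch records each agency once rather than once per path.

```python
from collections import deque


def bounded_bf_search(graph, start, is_agency, max_links=4, max_fan_out=50):
    """Breadth-first search with a path-length limit and fan-out pruning.

    graph: dict mapping node -> list of (link_label, neighbor) pairs.
    max_fan_out is a hypothetical threshold; the paper does not state one.
    Returns {agency: length in links of the shortest path found}.
    """
    found = {}
    visited = {start}
    queue = deque([(start, 0)])
    while queue:
        node, dist = queue.popleft()
        if dist == max_links:
            continue  # prune: expanding would exceed the path-length limit
        neighbors = graph.get(node, [])
        if node != start and len(neighbors) > max_fan_out:
            continue  # prune: do not expand extremely general nodes
        for _label, nxt in neighbors:
            if nxt in visited:
                continue
            visited.add(nxt)
            if is_agency(nxt):
                found[nxt] = dist + 1  # agencies are targets, not expanded
            else:
                queue.append((nxt, dist + 1))
    return found
```

Lowering `max_links` or `max_fan_out` trades recall for fewer false positives, which is the trade-off examined in the experiments that follow.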


The  following  tests  were  all  performed  on  a  set  of 
27  proposals,  representing  the  interests  of  a  diverse 
group  of  first-year  faculty  at  the  University  of  Mas¬ 
sachusetts.  The  first  test  was  designed  to  probe  the 
utility  of  endorsement-constrained  search.  We  com¬ 
pared  EC  and  BF  search  with  a  third  mode  called  un¬ 
constrained  keyword  search  (UKW).  It  finds  all  agen¬ 
cies  that  share  a  common  research  interest  with  a  pro¬ 
posal.  It  is  implemented  as  a  search  for  all  agencies 
exactly  2  links  distant  from  the  proposal.  For  exam¬ 
ple,  if  a  proposal  and  an  agency  share  the  common 
interest  dandelions,  then  each  will  be  linked  to  that 
node  by,  say,  a  SUBJECT  link.  The  two-link 
SUBJECT  :  dandelions  :  SUBJECT-OF 
path  connects  the  agency  and  the  proposal  via  the 
common  term  dandelion;  and,  in  general,  any  two- 
link  path  between  an  agency  and  a  proposal  indicates 
a  shared  term.  UKW  search  is  thus  a  simple  key¬ 
word  search,  since  it  finds  only  those  agencies  that 
share  terms  with  proposals.  The  relevant  statistics 
for  UKW,  EC,  and  BF  searches  are  shown  in  Table  1. 
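
Because a two-link path exists exactly when a proposal and an agency share a term, UKW search can be pictured as a set intersection over terms. The sketch below assumes each proposal and agency is reduced to a plain list of terms; the function and argument names are hypothetical and link labels are ignored.

```python
def ukw_search(proposal_terms, agency_terms):
    """Return {agency: shared terms} for agencies sharing a term with the proposal.

    proposal_terms: iterable of terms linked to the proposal.
    agency_terms:   dict mapping agency name -> iterable of its terms.
    """
    wanted = set(proposal_terms)
    matches = {}
    for agency, terms in agency_terms.items():
        shared = wanted & set(terms)  # a shared term is a two-link path
        if shared:
            matches[agency] = shared
    return matches
```

For example, a proposal on dandelions matches only those agencies whose descriptions also mention dandelions; an agency interested in weeds in general is missed, which is the gap that semantic search is meant to fill.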


                             UKW      EC      BF

fallout rate                 64%     71%     94%
recall rate                  44%     67%    100%
number of agencies found     164     406    2145
number of false positives    106     207    2013
number of hits                58      88     132
number correctly rejected      0     111       0

Table 1. Statistics from UKW, EC, and BF searches.


EC search has a higher recall than UKW and a lower
fallout rate than BF. Its fallout rate is typically higher
than UKW’s because it subsumes UKW: it finds all
the agencies that UKW finds, then finds some more
by exploiting semantic relations. Let us consider the
utility of this additional search.

Of  the  agencies  found  by  GRANT  for  the  27  test  cases, 
the  expert  thought  that  132  would  be  likely  to  fund 
their  respective  proposals.  UKW  found  just  44%  of 
these.  To  find  the  rest,  it  is  necessary  to  exploit  se¬ 
mantic  relationships  between  the  terms  used  in  re¬ 
search  proposals  and  agency  descriptions.  EC  search 
found  67%  of  the  agencies  judged  good  by  the  expert. 
It  found  242  more  agencies  than  UKW  search:  30  hits, 
101  false  positives,  and  111  correctly  rejected.  So  in 
the regions of the network that cannot be explored by
UKW keyword search, EC search found 40% of the
agencies it should, and incorrectly accepted 101 agencies,
for a “marginal” fallout rate of 42%. In contrast,
BF search found almost all the agencies judged good
by the expert, but at a cost of a 94% fallout rate.
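
The rates quoted above can be approximately recomputed from the counts in Table 1. The sketch below takes "judged good by GRANT" to mean hits plus false positives. The results agree with the reported percentages to within about a point; the rounding or averaging method behind the published figures is not stated. Reading the 42% marginal fallout as 101 out of all 242 additional agencies is an inference from the numbers, not something the text says.

```python
def recall(hits, good_by_expert):
    """Fraction of expert-approved agencies that the search found."""
    return hits / good_by_expert


def fallout(hits, false_positives):
    """Fraction of agencies judged good by the search that the expert rejected."""
    return false_positives / (hits + false_positives)


GOOD_BY_EXPERT = 132
TABLE_1 = {"UKW": (58, 106), "EC": (88, 207), "BF": (132, 2013)}  # (hits, FPs)

for mode, (hits, fps) in TABLE_1.items():
    print(f"{mode}: recall {recall(hits, GOOD_BY_EXPERT):.1%}, "
          f"fallout {fallout(hits, fps):.1%}")

# Agencies found by EC search beyond those found by UKW search.
extra_hits, extra_fps, extra_rejected = 30, 101, 111
marginal_recall = extra_hits / (GOOD_BY_EXPERT - 58)
marginal_fallout = extra_fps / (extra_hits + extra_fps + extra_rejected)
print(f"marginal recall {marginal_recall:.1%}, marginal fallout {marginal_fallout:.1%}")
```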

In practice, GRANT’s mode of operation is EC search.
It is preferred to UKW search because it finds more
agencies, and to BF search because it has higher
precision. BF search finds about 80 agencies per proposal
at  a  precision  of  6%  —  only  1  agency  in  20  is  truly 
worth  pursuing.  EC  search  reports  fewer  agencies  (15 
per  proposal),  has  a  better  level  of  precision  (29%) 
than  BF  search,  and  has  an  acceptable,  intermediate 
recall  rate  (67%). 

Since  EC  search  subsumes  UKW  search,  it  also  in¬ 
herits  a  significant  fallout  rate.  The  fallout  rate  for 
agencies  found  by  keyword  UKW  search  is  64%,  but 
the  marginal  rate  for  those  agencies  found  by  addi¬ 
tional  semantic  matching  is  just  42%.  Clearly,  path 
endorsements  can  increase  precision.  But  their  utility 
is obscured to some extent by the fact that EC search
“starts off” with the 106 false positives found by UKW
search. With this proviso stated, we now explore how
to increase the recall and precision of EC search.

Our  experiments  are  designed  to  address  two  general 
hypotheses. 


•  GRANT’s performance is due to its path
endorsements.

•  GRANT’s performance is affected by the structure
of its network, including the lengths of pathways
between proposals and agencies, and the degree
of interconnection between nodes.

A third hypothesis is that GRANT’s performance is
affected by how its language of links is used to encode
the interests of agencies. Since many people worked
on GRANT’s knowledge base, we were concerned that
knowledge was encoded inconsistently. We calculated
several statistics that measure consistency, but we did
not find significant or even suggestive correlations of
these measures with fallout rates. We cannot conclude
that inconsistencies have no effect on GRANT’s
performance, because our measures of consistency may
not be sufficiently sensitive. But we have found much
stronger evidence for the other two hypotheses.

Structural Factors in Recall and Precision. We
first calculated the recall and fallout rates as a function
of the distance between proposals and agencies in
EC search (Table 2). As noted, at distance = 2 EC
has the same fallout rate as UKW search, which finds
all agencies within two links of the proposal. Extending
the search one more link increases the recall rate
substantially (from 42% to 70%) and also raises the
fallout rate somewhat. Interestingly, extending the
search further has almost no effect on the recall rate
but does increase the fallout rate. This suggests that
endorsement-constrained search as implemented here
offers most advantage when finding agencies based on
a single semantic relationship between a term used
in the proposal and a term used in the agency
description. Increased fallout limits the utility of longer
chains of relations.


length         fallout rate    recall rate

less than 3         64              42
less than 4         73              70
less than 5         78              69

Table 2. Recall and fallout rates for searches
along pathways of different lengths.


The structural feature of GRANT’s network that
accounts for most variance in recall rate and fallout rate
is  the  branching  factor  of  nodes,  that  is,  the  number  of 
links  that  connect  nodes.  In  an  experiment  reported 
in  (Kjeldsen  and  Cohen,  1987)  we  found  that  the  fall¬ 
out  rate  was  correlated  with  the  average  branching 
factor  of  pathways  to  agencies.  Average  branching 
factor  is  the  average  of  the  number  of  links  emanating 
from each node on a pathway. It is a measure of the
“density” of the network in the vicinity of the
pathway. We expected dense areas of the network to have
low  fallout  rates  relative  to  recall  rates,  since  there 
are  more  nodes  per  agency  in  dense  areas,  and  thus 
more  basis  for  discriminating  good  agencies  from  bad 
ones.  Table  3  shows  the  percentage  of  the  false  pos¬ 
itives  found  along  pathways  with  low,  medium,  and 
high  branching  factors. 
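
The definition of average branching factor given above amounts to the following. This is a sketch under the assumption that the network is stored as a dict from each node to its list of outgoing links; that representation is not taken from the paper.

```python
def average_branching_factor(graph, path):
    """Mean number of links emanating from the nodes on a pathway.

    graph: dict mapping node -> list of outgoing links (any representation).
    path:  non-empty sequence of nodes from proposal to agency.
    """
    fan_outs = [len(graph.get(node, [])) for node in path]
    return sum(fan_outs) / len(fan_outs)
```

A pathway through a very general node such as “plant” picks up that node's large fan-out, so its average rises even when the other nodes on the path are specific.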


                        average branching factor
                        2-7      8-15     >16

EC search
  % hits                20.3     40.6     39.1
  % false positives      8.4     36.9     51.6

UKW search
  % hits                30.7     55.1     14.1
  % false positives      8.4     37.3     51.8

Table 3. Hits and false positives for EC and
UKW search, distributed by average branching
factor.

Contrary  to  our  expectations,  the  majority  of  false 
positives  were  associated  not  with  low  branching  fac¬ 
tors  but  rather  with  high  ones.  For  EC  search,  54% 
of  the  false  positives  were  found  on  paths  with  an  av¬ 
erage  branching  factor  greater  than  16.  For  UKW 
search,  51%  of  the  false  positives  were  associated  with 
high  branching  factor;  furthermore,  only  14%  of  the 
hits  were  found  in  these  areas.  We  looked  at  the  test 
cases  individually  to  try  to  explain  this  result.  Many 
of the false positives were associated with nodes with
high fan-out, such as “animal” and “location.” We
believe that such nodes are relatively general and that
their fan-out is due to their many specializations. To say an
agency  is  associated  with  one  of  these  general  nodes  is 
to  say  very  little  about  its  interests,  so  agencies  found 
via  these  nodes  are  more  likely  to  be  false  positives. 

These data seem to suggest that we could increase
GRANT’s precision by pruning agencies associated
with general nodes. In fact, this is an artifact of the
way we calculate precision. We could certainly reduce
the number of false positives this way, but we would
also reduce the number of agencies GRANT finds, and
so would have little effect on the fallout rate.
Moreover, since the denominator of the recall rate is
constant (the number of agencies judged good by the
expert), pruning agencies can only reduce the recall
rate. Clearly, false positives are associated with higher
branching  factors.  However,  the  key  to  improving 
precision  is  not  to  prune  agencies,  but  to  restructure 
the  network  so  that  it  has  fewer  pathways  with  high 
branching  factors,  that  is,  fewer  nodes  that  represent 
very  general  concepts.  For  example,  the  current  net¬ 
work  defines  dandelion  and  tomato  plant  as  instances 
of  the  plant  node,  though  they  are  obviously  differ¬ 
ent  kinds  of  plants.  The  distinction  could  be  made 
by  defining  dandelion  as  an  instance  of  a  weed  and 
tomato  plant  as  a  domestic  plant,  but  because  these 
nodes  are  not  in  the  network,  the  fan-out  of  plant  is 
higher  than  it  should  be  and  dandelion  and  tomato 
plant  are  not  adequately  discriminated. 

The  statistics  in  Table  3  suggest  that  the  “ideal” 
branching  factor  is  less  than  16.  Another  experi¬ 
ment  was  needed  to  pinpoint  the  ideal  more  precisely. 
Starting  with  the  list  of  agencies  found  by  the  EC 
search and reported in Table 1, we ranked the agencies
by their branching factors, and recalculated the
recall rate and fallout rate for each successive level
of  the  ranking.  That  is,  we  superimposed  a  ranking 
by  branching  factor  on  the  list  of  agencies  found  by 
EC  search  and  asked  about  the  recall  rate  and  fall¬ 
out  rate  of  all  agencies  that  had,  first,  low  branching 
factor,  then  those  that  had  higher  branching  factor, 
and  so  on.  (For  reasons  discussed  below,  we  used  the 
branching  factor  of  the  last  node  on  a  pathway  in¬ 
stead  of  the  average  branching  factor  over  all  nodes 
on  a  pathway.)  The  results  are  shown  in  Table  4. 


These  data  suggest  that  disproportionate  numbers  of 
false  positives  are  associated  with  low  and  moderately 
high  branching  factors.  At  the  lowest  level  (branching 
factor  of  3  or  less)  there  are  few  false  positives  (26)  and 
hits  (20)  because  few  nodes  have  such  low  branching 
factors.  At  the  next  level  we  consider  agencies  found 
via  nodes  with  branching  factor  of  7  or  less.  47  are 
false  positives,  an  increase  of  81%,  and  25  are  hits, 
an  increase  of  25%.  Thus,  fallout  rate  increases  faster 
than  recall  rate  for  nodes  with  relatively  low  branching 
factors.  When  nodes  with  higher  branching  factors 
(10  or  less)  are  considered,  fallout  rate  increases  by 
157%  and  recall  rate  by  a  comparable  140%.  However, 
adding  agencies  that  are  found  by  nodes  at  the  next 
level  of  branching  factor  (13  or  less)  increases  fallout 
rate  by  55%  but  increases  recall  rate  by  only  15%.  The 
rates  then  increase  proportionately  for  higher  levels  of 
branching  factor. 
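
The percentage changes quoted in this paragraph follow directly from the cumulative counts of false positives and hits at each branching-factor level. The short check below uses the counts given in the text and in Table 4; the variable and function names are illustrative only.

```python
# (level, cumulative false positives, cumulative hits)
LEVELS = [
    ("3 or less", 26, 20),
    ("7 or less", 47, 25),
    ("10 or less", 121, 60),
    ("13 or less", 188, 69),
    ("16 or less", 215, 81),
    ("any number", 219, 82),
]


def pct_change(old, new):
    """Percentage increase from old to new, rounded to a whole percent."""
    return round(100 * (new - old) / old)


# Compare each level with the one below it.
changes = [
    (label, pct_change(prev_fp, fp), pct_change(prev_hits, hits))
    for (_, prev_fp, prev_hits), (label, fp, hits) in zip(LEVELS, LEVELS[1:])
]

for label, fp_change, hit_change in changes:
    print(f"{label}: false positives +{fp_change}%, hits +{hit_change}%")
```

Levels where the first percentage far exceeds the second (7 or less, 13 or less) are those where fallout grows faster than recall.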

The  greatest  increase  in  recall  and  fallout  occurs  when 
we  add  the  agencies  found  via  nodes  with  branching 
factors  between  8  and  10.  Moreover,  the  numbers  of 
hits  and  fallouts  increase  by  roughly  the  same  amount 
in  this  area  (about  150%).  In  contrast,  false  positives 
increase  more  rapidly  than  hits  at  low  (3  -  7)  and  mod¬ 
erately  high  (11  -  14)  branching  factors.  This  suggests 
that  the  “ideal”  branching  factor  is  between  8  and  10, 
and  supports  the  hypothesis  that  recall  and  fallout 
rate  are  correlated  with  the  generality  -  as  measured 
by  branching  factor  -  of  nodes.  As  mentioned  above, 
we  used  the  branching  factor  of  the  last  node  on  a 
pathway - the one “nearest” to the agency and
“furthest” from the proposal - to produce the data in
Table  4.  We  reasoned  that  very  specific  nodes,  those 
with  low  branching  factor,  would  rarely  be  part  of  an 
agency  description,  and  so  would  not  be  associated 
with  many  hits.  On  the  other  hand,  as  we  argued 
above,  nodes  with  very  high  branching  factors  are  too 
general  to  represent  the  interests  of  an  agency  un¬ 
ambiguously,  and  so  would  be  associated  with  high 
fallout  rates. 


Agency is counted                      number   % change    number   % change
as “good” if the      fallout  recall  of FPs   in number   of hits  in number
branching factor is:  rate     rate             of FPs               of hits

any number              73       63      219        2         82         1
16 or less              73       62      215       14         81        17
13 or less              73       53      188       55         69        15
10 or less              67       46      121      157         60       140
7 or less               66       19       47       81         25        25
3 or less               58       15       26                  20

Table 4. Fallout and recall rates from ranking agencies by branching factor.

The primary implication of these results is that
knowledge engineers for GRANT-style systems should
ensure that the definitions of new terms are as specific as
possible. For example, the knowledge engineer should
define a new plant in terms of the most specific
possible subclass of plants, or perhaps create a new
subclass, rather than linking the new plant to the
general plant node. Currently, GRANT is programmed
to  avoid  nodes  with  extremely  high  fan-out.  An  al¬ 
ternative  would  be  to  alert  the  knowledge  engineer  to 
them  during  the  development  of  the  knowledge  base, 
to  fix  the  problem  before  it  arises.  Then,  any  remain¬ 
ing  nodes  with  high  fan-out  almost  certainly  denote 
concepts  that  are  too  general  to  be  useful,  and  en¬ 
dorsements  could  be  designed  to  avoid  them,  or  to 
give  them  a  low  rank. 

Endorsements as a Factor in Recall and Precision.
Our second hypothesis is that although the
representation  language  for  the  network  is  probably 
sufficient  to  encode  the  meaning  of  research  proposals 
and  agency  descriptions,  these  representations  are  not 
being  exploited  by  endorsement-constrained  search. 
Several  findings  support  this  hypothesis.  In  (Kjeldsen 
and  Cohen,  1987)  we  reported  that  just  three  path  en¬ 
dorsements  accounted  for  85%  of  the  hits  but  the  same 
three  led  to  42%  of  the  false  positives.  The  culprits 
were: 

•  SUBJECT  :  SUBJECT-OF 

•  SUBJECT  :  SUBJECT-OF  :  SUBJECT-OF

•  OBJECT  :  SUBJECT-OF 
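
Path endorsements like the three listed above can be read as patterns over the sequence of link labels on a pathway, each assigned to a class. The sketch below shows such a lookup. The class names appear later in the paper, but the class given to each pattern here is hypothetical, as is the default for unmatched paths.

```python
# Link-label sequence -> endorsement class. Class assignments are
# illustrative assumptions, not GRANT's actual endorsements.
ENDORSEMENTS = {
    ("SUBJECT", "SUBJECT-OF"): "very-likely",
    ("SUBJECT", "SUBJECT-OF", "SUBJECT-OF"): "likely",
    ("OBJECT", "SUBJECT-OF"): "likely",
}


def classify_path(link_labels, default="unendorsed"):
    """Return the endorsement class for a pathway's link-label sequence."""
    return ENDORSEMENTS.get(tuple(link_labels), default)


def accepted(link_labels, allowed=("maybe", "likely", "very-likely")):
    """True if the pathway's class is among those the search may report."""
    return classify_path(link_labels) in allowed
```

Restricting `allowed` to fewer classes corresponds to the class-by-class comparison reported later in this section.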


Despite  the  fact  that  48  distinct  relations  are  used  in 
the  network  to  connect  concepts,  just  3  (SUBJECT, 
OBJECT,  and  SUBJECT-OF)  were  sufficient  to  find 
the  majority  of  hits  and  a  sizeable  portion  of  false- 
positives.  This  is  partly  due  to  the  relative  frequency 
of  these  links  in  the  network:  they  are  very  common 
and  so  support  a  disproportionate  number  of  path 
traversals.  However,  our  data  suggest  that  the  re¬ 
liance  on  these  links  is  not  due  entirely  to  their  fre¬ 
quency,  and  that  intelligent  use  of  other  links  could 
increase  recall  rate. 

We  measured  the  frequency  with  which  different  links 
were  used  to  represent  agency  descriptions.  These 
data are shown in Table 5. As expected, SUBJECT,
OBJECT, and FOCUS were most common,
but WHO-FOR and LOCATION were not infrequent.


Link        Number of uses in     Number of uses as
            agency definitions    last link of endorsements

subject           513                   19
object            258                   10
focus             238                   17
who-for           124                    2
location           80                    0
dv                 30                    8
iv                 20                    5
rv                 18                    5

Table 5. Number of times each link is used to
define agency interests, and number of times it
is the final link in an endorsement.


Agency is counted as
“good” if it is found                    number   % change    number   % change
by an endorsement       fallout  recall  of FPs   in number   of hits  in number
classified as:          rate     rate             of FPs               of hits

very-likely               55       18       28      425%        23       78%
likely or very-likely     73       42      147       41%        54       59%
maybe, likely,
or very-likely            71       67      207        4%        86        0%
unlikely, maybe,
likely, or very-likely    72       67      216                  86

Table 7. Fallout and recall rates from ranking
agencies by class of path endorsements.


However,  these  latter  links  were  almost  never  tra¬ 
versed  to  find  agencies:  Table  6  shows  the  results 
of  using  the  last  link  in  a  pathway  (the  one  closest 
to  the  proposal)  to  rank  the  agencies  found  by  EC 
search.  If  SUBJECT  and  OBJECT  are  the  only  links 
that  GRANT  is  allowed  to  traverse,  then  it  finds  74 
hits  and  179  false  positives.  It  finds  an  additional 
15  hits  when  it  is  also  allowed  to  traverse  FOCUS. 
But, remarkably, allowing it to traverse any link
results in only 2 more hits: most of GRANT’s hits are
found by following SUBJECT, OBJECT, and FOCUS
links into an agency. Although WHO-FOR and
LOCATION are used quite often to define the interests of
agencies, they are not used to find the agencies. This
is not surprising, since WHO-FOR and LOCATION
are the final link in only 2 path endorsements. But
it does suggest that using these and other links
judiciously could increase GRANT’s recall rate. In
general, these results stress that path endorsements must
reflect the conventions for representing concepts.

To get a more complete picture of the utility of
GRANT’s path endorsements we would perform
“ablation studies,” removing path endorsements one
at a time to see how they affect recall and precision.
Unfortunately, an exhaustive analysis of all
endorsements would require weeks of computer time. Instead,
we grouped the path endorsements and assessed the
effects on performance of removing these classes.
Every path endorsement is assigned to one of five classes
that reflects the subjective probability that an agency
found by that endorsement would fund the proposal.
The classes are trash, unlikely, maybe, likely, and
very-likely. We used these classes to rank as “good”
or “bad” the agencies found by EC search, then
recalculated recall and fallout rates for each rank. The
results are shown in Table 7.


Agency is counted as
“good” if the last link      fallout  recall  number   number
in a pathway is:             rate     rate    of FPs   of hits

SUBJECT or OBJECT              71       57      179       74
SUBJECT, OBJECT, or FOCUS      72       68      228       89
ANY LINK                       73       70      251       91

Table 6. Fallout and recall rates from ranking agencies by final link.


When only very-likely endorsements are allowed, the
numbers of hits and false positives are low (23 and 28,
respectively). Adding in agencies that are found via
paths with likely endorsements increases the number
of false positives by over 400% to 147. This seems an
excessive price to pay for the 78% increase (from 23
to 54) in the number of hits. In contrast, adding in
agencies with maybe endorsements increases the
number of hits by 59% and increases false positives by a
significantly lower amount, 41%. (The main reason
for the increase in recall is that FOCUS links are used
in a preponderance of maybe endorsements, and are
infrequently used in likely or very-likely. We saw in
Table 5 that the FOCUS link is used frequently in
defining agencies, and in Table 6 that inclusion of the
FOCUS link increases GRANT’s recall rate.)
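
The fallout rates for the endorsement classes can be recomputed from the cumulative counts of hits and false positives, taking fallout as false positives divided by hits plus false positives. The sketch below does this for each cumulative set of classes. The names are illustrative, and recall is omitted because the text does not give the denominator used for these rates.

```python
# (classes admitted so far, cumulative hits, cumulative false positives)
CLASS_LEVELS = [
    ("very-likely", 23, 28),
    ("likely or better", 54, 147),
    ("maybe or better", 86, 207),
    ("unlikely or better", 86, 216),
]


def fallout_pct(hits, false_positives):
    """False positives as a whole-number percentage of agencies reported."""
    return round(100 * false_positives / (hits + false_positives))


class_fallout = [fallout_pct(h, fp) for _, h, fp in CLASS_LEVELS]

for (label, _, _), rate in zip(CLASS_LEVELS, class_fallout):
    print(f"{label}: fallout {rate}%")
```

Admitting the maybe class lowers the computed fallout slightly while adding 32 hits, which is the effect the paragraph above attributes to FOCUS links.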

Clearly, GRANT’s fallout rate could be improved by
refining its likely endorsements. The improvement in
performance due to adding maybe endorsements
(specifically those dealing with FOCUS links)
convinces us that it is possible to add endorsements that
will increase recall and precision simultaneously.
Table 5 suggests that these endorsements should exploit
WHO-FOR  and  LOCATION  links,  which  are  used  to 
define  agencies  but  are  rarely  traversed  to  find  them. 
We  are  currently  designing  new  endorsements,  though 
they  will  have  to  be  tested  on  a  new  set  of  proposals  to 
ensure  that  they  are  not  simply  “tuned”  to  the  current 
test  cases. 

The main conclusion of our work is that constrained
spreading activation finds agencies based on semantic
relations, with reasonable recall and precision, that
would not be found by simple keyword search. From
a pragmatic standpoint, the Office of Research Affairs
at the University of Massachusetts prefers GRANT for
several reasons to the database program that it used
previously. GRANT is more efficient. A session takes
just a few minutes: the proposal is coded, GRANT
runs a search, a list of 15 agencies (on average) is
returned, and the user sorts them to find 2 or 3 that
are ideal for the client. In contrast, a similar search
takes about 2 hours with the old keyword database
system, in part because the dozens of agencies
returned by the old system must be carefully sorted (its
precision is only about 5%). GRANT’s performance is
well-suited to the funding domain because researchers
rarely send a proposal to many agencies, but several
agencies will typically fund a piece of research. Thus,
GRANT’s relatively low precision (29%) is not
bothersome because a search returns relatively few agencies:
ample to find 2 or 3 for the client but few enough
to sort quickly. And since a proposal can potentially
be funded by several agencies, GRANT’s recall rate
(67%) is sufficient to find enough good candidates for
the user.

GRANT was designed to have the advantages of human
associative memory but to be more reliable. It is
difficult to evaluate any system on such vague criteria,
but the experiences of GRANT’s users are suggestive.
At  first,  they  expected  GRANT  to  accelerate  their 
processing  of  “easy”  cases.  They  found  instead  that 
easy  cases  were  those  that  could  be  answered  from 
memory,  and  that  GRANT  is  most  useful  for  difficult 
cases  —  those  for  which  no  agencies  come  to  mind. 
Apparently, GRANT’s associative memory finds plausible
semantic connections between topics in proposals
and  agencies  that  human  funding  advisors  either  for¬ 
got  or  never  knew. 

We are considering other applications of constrained
spreading activation. A straightforward extension of
GRANT is to run the system “backwards,” taking as
input  an  agency’s  request  for  proposals  (RFP)  and 
searching  for  the  appropriate  faculty  members  to  re¬ 
ceive  the  RFP.  The  research  interests  of  many  of  the 
faculty  at  the  University  of  Massachusetts  have  been 
encoded  for  this  purpose.  Another  goal  is  an  intelli¬ 
gent  index  for  a  major  reference  book,  since  GRANT 
is  adept  at  inferences  of  the  form  “if  a  researcher  (or 
reader)  is  interested  in  topic  X  then  he  or  she  is  likely 
to  be  interested  in  a  related  topic  Y.”  Other  poten¬ 
tial  applications  are  literature  search  and  searching 
databases  of  news  wire  services. 

Although  constrained  spreading  activation  is  a  simple 
algorithm,  and  seems  widely  applicable,  the  invest¬ 
ment  required  to  build  GRANT-like  systems  is  sub¬ 
stantial.  Five  steps  are  involved.  First,  one  must 
analyze the domain to design a language for
representing the domain’s concepts and their
interrelationships. Concepts in GRANT’s network are linked by
24  different  relationships  and  their  inverses.  We  had 
to  interview  an  expert  funding  advisor  at  length  to  ac¬ 
quire  this  vocabulary  of  links.  Second,  a  network  must 
be  constructed  to  represent  and  index  the  targets  of 
search,  be  they  agencies,  bibliographic  references,  or 
people. Roughly 4 person-months of effort were
required to build GRANT’s 4500-node, 700-agency
network. Third, path endorsements must be formulated.
Fourth, the system must be tested and the path
endorsements refined. Finally, for most interesting
domains, one will be constantly updating information
about  the  targets  of  search,  adding  new  ones,  modify¬ 
ing  the  descriptions  of  old  ones,  and  so  on. 


ABSTRACT:  Expert  systems  still  lack  the  skill
of  an  expert  when  it  comes  to  providing
explanations  of  the  results  of  expert  reasoning. 
This  is  because  while  such  systems  may 
implement  knowledge  which  is  sufficient  to 
mimic  the  performance  of  an  expert,  they  do 
not  necessarily  model  that  expert’s 
understanding  of  a  problem  domain.  Such  a 
model  must  include  knowledge  of  that  domain’s 
terminology,  knowledge  of  domain  facts,  and 
knowledge  of  problem  solving  methods.  The 
Explainable  Expert  Systems  project  has  been 
exploring  a  new  paradigm  for  expert  system 
development  that  is  intended  to  capture  such 
missing  knowledge  and  make  it  available  for 
explanation.  This  paper  will  discuss  the 
principles  behind  this  paradigm  and  consider
two  systems  which  have  been  subsequently 
developed. 


1.  Introduction 

Being  an  expert,  even  in  a  single  domain,  requires  more 
than  one  kind  of  expertise.  In  testing  or  validating  an 
expert  system,  we  focus  on  performance  expertise,  the 
knowledge  required  to  get  the  right  answer,  or  more 
precisely,  to  get  the  same  answer  as  a  human  expert.  But 
we  require  more  of  human  experts  than  just  getting  the
right  answer.  They  must  be  able  to  explain  and  justify 
their  answers,  acquire  new  knowledge  from  a  variety  of 
sources,  and  learn  from  experience.  Focussing  on 
explanation,  in  this  paper  we  will  describe  how  examining 
the  problem  of  producing  explanations  of  expert  system 
behavior  has  led  us  to  a  better  model  of  the  kinds  of 
expertise  that  an  expert  system  should  possess.  This 
includes  knowledge  of  terminology,  which  is  knowledge  of 
domain  concepts  and  how  they  are  defined,  knowledge  of 
domain  descriptive  facts,  which  is  knowledge  that 
describes  the  structure  of  the  domain  and  relationships 
among  entities  within  it,  and  knowledge  of  problem 
solving  methods. 


1  This  paper  has  been  submitted  to  the  Journal  of  Expert  Systems 


For  most  expert  system  projects,  the  primary  concern  is 
to  represent  the  knowledge  needed  for  the  expert  system 
to  produce  solutions  similar  to  ones  produced  by  human 
experts.  At  one  time,  it  was  thought  that  if  that 
knowledge  was  represented  in  a  sufficiently  declarative 
form  it  would  be  relatively  easy  to  produce  explanations 
of  the  system’s  behavior  by  paraphrasing  the  rules  or 
traces  of  their  execution  into  natural  language.  The  fact 
that  the  expert  system  could  mimic  the  expert’s  behavior 
was  evidence  that  the  expert’s  knowledge  had  been 
captured,  so  explanation  merely  required  parroting  back 
the  knowledge  base  to  the  user  in  a  palatable  form. 

This paraphrase-the-code approach could give
reasonable explanations of how the system’s problem
solving methods worked or were applied to a particular
problem.  However,  the  problem  was  that  the  range  of 
questions  that  people  may  reasonably  ask  of  an  expert 
goes  beyond  questions  about  how  a  solution  was 
obtained.  Answering  these  questions  requires  additional 
knowledge  over  and  above  the  knowledge  needed  to 
produce  the  solution. 

We will illustrate the limitations of the
paraphrase-the-code approach with an example from an early version of
the  Digitalis  Therapy  Advisor  [Swartout  77].  Digitalis  is 
a  drug  given  to  many  cardiac  patients.  It  is  a  difficult 
drug  to  give  properly  and  considerable  expertise  is 
required  for  proper  administration.  One  complicating 
feature  is  that  certain  physiological  abnormalities  in  a 
patient  may  make  him  unusually  sensitive  to  the  drug. 
These  sensitivities  must  be  taken  into  account  to  reduce 
the  risk  of  accidental  overdose. 

In  Figure  1-1  the  physician  who  is  using  the  system 
wants  to  know  why  the  system  needs  to  know  the 
patient’s  serum  calcium  level.  The  system  responds  by 
describing those activities it is trying to perform. While
this explanation does suggest that serum calcium is some
sort of a sensitivity, in all probability, it would not be
considered  acceptable  if  it  were  given  by  an  expert 
physician.  It  would  probably  provoke  certain  follow-on 
questions  which  are,  in  fact,  beyond  the  scope  of  this 
explanation  facility. 
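
The kind of answer shown in Figure 1-1 can be produced by paraphrasing the stack of goals the system is currently pursuing. The following is a minimal sketch of that idea. It is not the Digitalis Therapy Advisor's implementation; the function name and the list-of-strings goal representation are assumptions made for illustration.

```python
def explain_why(goal_stack):
    """Paraphrase a goal stack (outermost goal first) as a "why?" answer."""
    top, *steps = goal_stack
    sentences = [f"MY TOP GOAL IS TO {top}."]
    for step in steps[:-1]:
        sentences.append(f"ONE STEP IN DOING THAT IS TO {step}.")
    if steps:
        sentences.append(f"I AM NOW TRYING TO {steps[-1]}.")
    return " ".join(sentences)


print(explain_why([
    "BEGIN THERAPY",
    "CHECK SENSITIVITIES",
    "CHECK SENSITIVITY DUE TO CALCIUM",
]))
```

Such a procedure can only restate the goals that happen to be on the stack. It has no knowledge of why serum calcium matters, which is why it cannot answer the follow-on questions discussed here.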

To  better  understand  the  types  of  questions  that  should 
be  offered,  and  the  expertise  that  must  be  modelled  to 


ENTER  NUMERICAL  VALUE  ==>  why? 

The  user  types  "why?”  indicating  that  he 
questions  why  he  is  being  ashed  to  supply 
the  value  of  serum  calcium. 

WHAT  IS  THE  LEVEL  OF  SERUM 
CALCIUM? 

MY  TOP  GOAL  IS  TO  BEGIN  THERAPY. 

ONE  STEP  IN  DOING  THVT  IS  TO  CHECK 
SENSITIVITIES.  I  AM  NOW  TRYING  TO 
CHECK  SENSITIVITY  DUE  TO  CALCIUM. 

Figure  1-1:  A  Limited  Justification  for  a  Question 

answer  them,  we  took  protocols  of  potential  end-users 
interacting  with  expert  systems.  The  users  were 
encouraged  to  ask  questions  during  the  interaction.  No 
formal analysis of the protocols was undertaken, but
based  on  them  and  our  own  experience  as  expert  system 
builders,  we  identified  approximately  a  dozen  different 
classes  of  useful  explanations  (see  [Swartout  86]).  We  will 
focus  on  three  types  of  questions  here: 

1.  Justifications 

• Why is serum calcium an important
factor  in  digitalis  administration? 

2.  Questions  about  terminology  of  the  domain 
and  its  definition. 

•  What  is  a  "sensitivity?" 

3.  Questions  about  the  intent  behind  a  goal. 

• What does it mean to perform a
diagnosis?

•  What  does  it  mean  to  check 
sensitivities? 

There  are  several  reasons  why  it  is  important  for  an 
expert  system  to  be  able  to  answer  questions  such  as 
these.  First,  a  user  is  more  likely  to  accept  an  expert 
system’s  recommendations  if  he  can  assure  himself  that 
the  system’s  reasoning  is  based  on  a  sound  understanding 
of  the  underlying  principles  of  the  domain.  Second,  the 
answers  to  these  questions  can  help  a  user  understand 
how  closely  his  understanding  of  the  domain  agrees  with 
the  system’s.  If  there  is  a  wide  disparity,  that  can  serve 
as  a  warning  that  the  expert  system  may  be  being  pushed 
beyond  the  bounds  of  its  capabilities.  Third,  these 
explanations  may  help  educate  inexperienced  users  about 
the  fundamentals  of  the  expert  system’s  domain. 

The reason why these and similar questions, whose
answers are so natural to an expert, are so problematical
for an expert system is that most explanations, like
that  in  Figure  1-1,  are  based  on  knowledge  that  is  only 
sufficient  to  mimic  the  performance  of  an  expert,  or 
traces  of  the  results  of  the  application  of  such  knowledge. 


Thus,  explanations  can  be  provided  of  how  a  method 
works,  or  was  applied  in  a  particular  setting.  However, 
no  account  can  be  given  for  why  such  activities  have 
occurred  or  why  the  system  is  trying  to  achieve  them. 
Also,  the  terms  that  the  system  employs  and  the  intent 
behind  its  goals  cannot  be  defined  because  no  explicit 
definition  is  provided  for  them. 

What  has  happened  to  the  information  required  to  deal 
with  such  questions?  The  knowledge  that  is  needed  to 
answer  them  is  known  by  a  system  builder  at  the  time  he 
creates  an  expert  system  and  is  used  by  him  in  the 
process of deriving the expert system’s rules or methods.
But because that knowledge is not needed for the expert
system to perform properly, it does not appear in the
rules  or  methods  of  the  expert  system  itself,  and  hence  is 
unavailable  for  explanation. 

In  the  Explainable  Expert  Systems  (EES)  project,  we 
have  been  exploring  a  new  paradigm  for  expert  system 
development  that  is  intended  to  capture  such  missing 
knowledge  and  make  it  available  for  explanation.  In  our 
approach,  system  builders  and  domain  experts  collaborate 
to  construct  a  high-level  representation  of  knowledge  in 
the  domain  that  includes  the  normally  missing  knowledge 
that  forms  the  basis  for  an  expert  system’s  rules  or 
methods.  Expert  behavior  is  then  derived  automatically 
from  this  knowledge  base.  A  trace  of  the  derivation 
process  is  left  behind.  This  trace  connects  the  behavior 
of  the  system  with  the  additional  knowledge  required  to 
satisfy  the  needs  of  explanation. 

In  the  remainder  of  this  paper,  we  present  two  systems 
that  follow  this  approach.  In  Sections  2  and  3  we  discuss 
EES version I, which explicitly separates knowledge of
how the domain works, knowledge of problem solving,
and knowledge of terminology. From this high-level
knowledge base, an automatic programmer derives
performance-level rules or methods of the sort found in
conventional expert systems (see Figure 1-2). The
derivation process is recorded in a machine-readable
development history, and that recorded trace is used to
provide the normally missing knowledge needed for
explanations that reflect not only the system’s
performance level knowledge but also the support
knowledge  underlying  it. 

In Section 4 we discuss EES version II. This system
embodies  a  more  explicit  representation  of  the  relation 
between  problem  solving  knowledge  and  domain 
descriptive  knowledge.  Domain  specific  problem  solving 
knowledge  is  derived  directly  from  domain  descriptive 
knowledge,  so  that  problem  solving  methods  may  be 
explained in terms of the domain knowledge that
underlies  them.  We  have  also  added  "weak  methods," 
which  are  used  to  represent  domain  independent  problem 
solving  knowledge. 
Specific issues concerned with the generation of
explanations are discussed in Section 5. This task
imposes additional constraints on the structure of our
knowledge  base,  and  we  consider  how  these  constraints 
may  be  satisfied. 

2. A Declarative Representation for Expertise:
The Knowledge Base

In  constructing  an  expert  system  using  this  approach, 
the  first  step  is  to  understand  what  kinds  of  knowledge 
or  expertise  need  to  be  represented  and  how  that 
knowledge should be partitioned. Relying again on our
study of question types, we found three important kinds
of  expertise  to  represent: 

•  Terminological  Knowledge  is  knowledge  of  the 
concepts  and  relationships  of  a  domain  that 
experts  use  to  communicate  with  one  another. 

In  expert  systems,  terminology  forms  a 
language  that  knowledge  sources  use  to 
communicate  with  one  another  and  it 
provides  the  building  blocks  from  which 
representations  for  other  kinds  of  knowledge 
are  constructed. 

•  Domain  Descriptive  Knowledge  describes  how 
the  domain  works.  It  can  be  thought  of  as 
the  "textbook  rudiments"  which  are  required 
before  one  can  turn  to  solving  problems.  In  a 
medical  domain,  this  would  be  primarily 
physiological  knowledge,  describing  causal 
relations  among  physiological  states  and 
symptoms  associated  with  diseases,  and  the 
effects of various therapies. In another
domain, such as diagnosing an electronic
circuit, this would include knowledge of the
circuit schematic and of the behavioral
characteristics of the various components that
make up the circuit.

•  Problem  Solving  Knowledge  is  "how  to" 
knowledge.  It  supplies  knowledge  about  how 
tasks  (called  goals  in  our  system)  can  be 
accomplished.  This  is  where  knowledge  about 
how  to  perform  a  diagnosis  or  how  to 
administer  a  drug  belongs.  In  our 
representation,  problem  solving  knowledge  is 
organized  into  plans. 

We  shall  now  discuss  each  of  these  kinds  of  knowledge 
in  terms  of  some  specific  examples. 

2.1.  Terminology 

It  may  seem  odd  to  think  of  terminology  as  a  kind  of 
expertise,  but  before  one  can  begin  to  understand  a 
domain,  one  must  understand  the  terms  that  are  used  to 
describe  it.  During  the  first  stages  of  building  an  expert 
system,  when  system  builders  are  debriefing  an  expert 
about  a  domain,  much  of  the  time  they  spend  together  is 
concerned  with  understanding  the  terminology  of  the 
domain.  Since  terminology  provides  the  building  blocks 
out  of  which  an  expert  system  is  constructed,  it  plays  a 
pivotal  role  in  the  process  of  building  an  expert  system. 

Despite  its  importance,  few  expert  systems  have  any 
representation  for  the  definition  of  terminology.  The 
terminology  is  known  by  the  system  builder,  but  it  is  not 
explicitly  defined  within  the  expert  system  itself.  Instead, 
the terms used by the system implicitly acquire a
definition based on how other knowledge sources in the
system react to them and the operational mechanisms for
recognizing instances of those terms. This can lead to
problems both in explanation and maintenance of an
expert system.

Figure 1-2: Architecture for EES version I

To  illustrate  the  problem  of  implicit  terminology 
briefly, suppose we define a simple rule for recognizing
fever:

If patient’s temperature > 100
then conclude fever.

One  might  envision  an  explanation  facility  which,  if 
faced  with  the  need  to  define  the  concept  of  "fever," 
would search for knowledge of how fever was recognized.
It  could  then  use  this  rule  to  reply  that  fever  was  a 
condition  in  which  the  patient  had  a  temperature  greater 
than  100  degrees. 

In  fact,  that  would  confuse  an  operational  means  for 
recognizing  fever  with  a  definition  for  it.  To  see  that, 
suppose  now  that  an  expert  system  with  this  rule  and 
explanation  facility  is  actually  deployed  "in  the  field." 
Under  such  circumstances  it  may  yield  many  false 
positive  results  because  some  people  drink  hot  coffee 
before  their  temperature  is  taken.  Such  a  "bug"  may  be 
easily fixed by modifying the rule:

If  patient’s  temperature  >  100 

and  patient  has  not  recently  drunk 
coffee 

then  conclude  fever. 

What  will  this  do  to  the  explanation  facility?  Since  it 
depends  on  the  modified  rule,  it  will  now  include  the 
consumption  of  coffee  as  part  of  its  description  of  fever. 
Such  a  rule  can  also  make  system  maintenance  difficult. 
The  two  predicates  in  the  rule  serve  very  different  roles. 
The first is concerned with establishing criteria for
recognizing fever, while the second ensures the validity of
the  temperature  measurement.  These  roles  are  not 
differentiated  in  the  rule  itself  and  in  more  complex 
systems  this  can  exacerbate  maintenance  problems.  This 
example  should  demonstrate  why,  for  the  sake  of 
explanation  and  maintenance,  terminology  should  be 
regarded  as  a  body  of  knowledge  distinct  from  problem 
solving  knowledge. 
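
The separation argued for above can be illustrated with a small Python sketch. It is not the EES representation, only a minimal illustration under stated assumptions: the names Reading, meets_fever_definition, measurement_is_valid, conclude_fever and define_fever are hypothetical, and the unit of the threshold (taken here to be degrees Fahrenheit) is an assumption, since the rule in the text says only "100".

```python
from dataclasses import dataclass

# Threshold from the rule in the text; Fahrenheit is an assumption.
FEVER_THRESHOLD_F = 100.0


@dataclass
class Reading:
    """A single temperature measurement (hypothetical record)."""
    temperature_f: float
    recently_drank_coffee: bool = False


def meets_fever_definition(temperature_f: float) -> bool:
    """Terminological knowledge: what the term 'fever' means."""
    return temperature_f > FEVER_THRESHOLD_F


def measurement_is_valid(reading: Reading) -> bool:
    """Problem solving knowledge: can this measurement be trusted?"""
    return not reading.recently_drank_coffee


def conclude_fever(reading: Reading) -> bool:
    """Operational rule assembled from the two separate pieces."""
    return measurement_is_valid(reading) and meets_fever_definition(
        reading.temperature_f
    )


def define_fever() -> str:
    """An explanation draws only on the definition, not on validity checks."""
    return f"Fever is a temperature greater than {FEVER_THRESHOLD_F:g} degrees."
```

Because the coffee check lives outside the definition, adding or changing validity checks leaves the answer to "What is a fever?" untouched.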

To  provide  an  explicit  representation  for  terminology, 
we  have  been  using  a  knowledge  representation  system 
based on the ideas pioneered in KL/ONE [Brachman
78, Moser 83]. A design goal behind KL/ONE based
formalisms is to provide an explicit representation for the
definition of terminology. Our representation, like
KL/ONE, is a semantic network-based formalism.
Concepts (which correspond to terms) have attributes,
corresponding  to  "slots"  in  frame-based  representations. 
Restrictions  may  be  placed  on  the  possible  fillers  of  the 
attribute  slots  for  a  particular  concept,  and  these 


restrictions,  together  with  the  attributes  of  a  concept 
contribute  to  the  definition  of  the  concept.  Concepts  are 
attributes  arranged  in  a  generalization  hierarchy  based  on 
subsumption  relations  among  them.  As  new  terms  are 
introduced, an automatic classification facility [Schmolze
83] determines their position in the classification
hierarchy, based on their definitions and the definitions
of existing terms (see [Neches, et al. 85] for an example of
classification). 
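
The idea of placing a new term in the hierarchy by comparing definitions can be sketched as follows. This is a deliberately crude stand-in for a KL/ONE-style classifier, not the facility cited above: a definition is reduced to a set of (attribute, filler) pairs, and one concept subsumes another when all of its restrictions are also satisfied by the other. The class Concept and the functions subsumes and classify are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Concept:
    name: str
    # Each restriction is an (attribute, required filler) pair.
    restrictions: frozenset = frozenset()


def subsumes(general: Concept, specific: Concept) -> bool:
    """True if `specific` satisfies every restriction of `general`."""
    return general.restrictions <= specific.restrictions


def classify(new: Concept, existing: list) -> list:
    """Names of the most specific existing concepts that subsume `new`."""
    subsumers = [c for c in existing if subsumes(c, new)]
    most_specific = [
        c for c in subsumers
        # Drop c if some other subsumer is strictly more specific than c.
        if not any(subsumes(c, o) and not subsumes(o, c) for o in subsumers)
    ]
    return sorted(c.name for c in most_specific)


thing = Concept("thing")
deviation = Concept("deviation", frozenset({("kind", "deviation")}))
observable = Concept(
    "observable deviation",
    frozenset({("kind", "deviation"), ("observable", "yes")}),
)
calcium = Concept(
    "increased serum calcium",
    frozenset({("kind", "deviation"), ("observable", "yes"),
               ("parameter", "serum calcium")}),
)
```

With these toy definitions, classify(calcium, [thing, deviation, observable]) places the new term directly under "observable deviation" without that link ever being asserted by hand.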

In  the  domain  of  digitalis  therapy,  some  of  the  terms 
include  physiological  parameters  that  are  important  in 
this domain such as increased serum calcium and
decreased serum potassium, both of which are
specializations of observable deviation. Composite
terms may be built from other terms, such as the goal
compensate digitalis dose for digitalis
sensitivities which would be composed from the
individual terms compensate, digitalis, dose and
sensitivity.

As  we  will  see  below,  terminology  plays  an  important 
role  in  integrating  domain  descriptive  knowledge  and 
problem  solving  knowledge  during  the  process  of 
synthesizing  an  expert  system.  The  term  drug 
sensitivity  will  play  that  role  in  the  example  we 
present.  We  define  it  as: 

drug sensitivity: an observable
deviation that causes something
dangerous that is also caused by
the drug

2.2.  Domain  Descriptive  Knowledge 

As  was  mentioned  above,  domain  descriptive  knowledge 
describes  how  the  domain  works.  It  is  typically  the  sort 
of knowledge that one finds in textbooks. Like
definitions, such knowledge may be represented as
declarative  assertions.  The  domain  descriptive  knowledge 
for  digitalis  therapy  included  facts  such  as: 

Increased digitalis causes increased
automaticity.

Decreased serum potassium causes increased
automaticity.

Increased serum calcium causes increased
automaticity.

Increased automaticity may cause
ventricular fibrillation.

Ventricular fibrillation is a dangerous
condition.

Decreased serum potassium is an observable
deviation.

Increased serum calcium is an observable
deviation.
In the heart, automaticity refers to the degree to which
muscle cells are likely to fire spontaneously, resulting in
abnormal heart rhythms. Ventricular fibrillation is one
such abnormal rhythm that is very dangerous because it
indicates a condition in which the heart has ceased to
pump blood.
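
Taken together with the definition of drug sensitivity in Section 2.1, the facts listed above are enough to work out which deviations count as digitalis sensitivities. The Python sketch below shows one way this could be done; it is an illustration only, not the EES classifier. The tuple encoding and the function names effects and drug_sensitivities are hypothetical, and "may cause" is treated the same as "causes" for simplicity.

```python
# Domain facts from the text, encoded as plain tuples and sets.
CAUSES = {
    ("increased digitalis", "increased automaticity"),
    ("decreased serum potassium", "increased automaticity"),
    ("increased serum calcium", "increased automaticity"),
    ("increased automaticity", "ventricular fibrillation"),  # "may cause"
}
DANGEROUS = {"ventricular fibrillation"}
OBSERVABLE_DEVIATIONS = {"decreased serum potassium", "increased serum calcium"}


def effects(cause: str) -> set:
    """All states reachable from `cause` by following causal links."""
    seen = set()
    frontier = [cause]
    while frontier:
        current = frontier.pop()
        for source, target in CAUSES:
            if source == current and target not in seen:
                seen.add(target)
                frontier.append(target)
    return seen


def drug_sensitivities(drug_state: str) -> set:
    """Observable deviations that cause something dangerous
    that is also caused by the drug."""
    drug_dangers = effects(drug_state) & DANGEROUS
    return {d for d in OBSERVABLE_DEVIATIONS if effects(d) & drug_dangers}
```

Calling drug_sensitivities("increased digitalis") yields both increased serum calcium and decreased serum potassium, since each leads to ventricular fibrillation by way of increased automaticity, as does the drug itself.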

While causal knowledge was a central part of domain
descriptive  knowledge  for  this  domain,  it  is  important  to 
emphasize  that  domain  descriptive  knowledge  is  not 
always  causal.  In  well  understood  domains,  such  as 
electronic  circuit  analysis,  the  domain  descriptive 
knowledge  might  be  a  mechanistic  description  of  the 
circuit,  while  in  poorly  understood  domains,  the  domain 
descriptive  knowledge  might  merely  consist  of 
probabilistic  associations  between  various  states  and  state 
changes  in  the  domain. 

What  is  not  part  of  domain  descriptive  knowledge  is 
knowledge  of  how  to  achieve  various  results:  how  to 
diagnose  a  patient  or  administer  a  drug.  That  knowledge 
is  part  of  problem  solving  knowledge. 

2.3. Problem Solving Knowledge

Problem  solving  knowledge  is  expressed  as  plans  that 
express  how  tasks  can  be  accomplished.  Plans  have 
capability  descriptions  which  describe  what  goals  they 
can  achieve.  Each  plan  also  has  a  method  which  is  a 
sequence  of  substeps  (which  may  themselves  include 
subgoals)  for  accomplishing  the  goal.  Capability 
descriptions  are  patterns  and  may  include  variables  that 
are  bound  when  the  capability  description  is  matched 
against  a  goal  to  be  achieved.  Plan  capabilities  and  goals 
are represented using our representation of terminology
described  above.  The  generalization  hierarchy  of  terms 
induces  a  generalization  hierarchy  of  plans  which  we  use 
for  finding  candidate  plans  for  achieving  goals  (see 
[Neches, et al. 85] for details).

Another  aspect  of  problem  solving  knowledge  is 
integration  knowledge.  This  is  knowledge  of  how  to 
combine  similar  results  from  multiple  knowledge  sources. 
We  will  describe  integration  knowledge  in  greater  detail 
in the context of an example in Section 3.1.

In  EES,  we  have  tried  to  improve  on  the  very  specific, 
low-level  representation  of  problem  solving  knowledge 
that exists in conventional expert system frameworks. As
an example, one of the situations that MYCIN had to
deal with was the possibility of being able to figure out
the genus of the microorganism that was infecting the
patient but not being able to figure out its species. In
that case, MYCIN just assumed that the species of the
microorganism was the most likely one for the particular
genus, a reasonable default strategy. What was
unreasonable was that that strategy was not expressed as
a single rule, but instead as a whole collection of rules,
each specific to one of the genera that MYCIN knew


about.  From  the  standpoint  of  explanation,  this  is  bad 
because  the  system  has  no  representation  of  the  general 
strategy  it  is  following;  hence  it  cannot  be  explained.  It 
is  also  bad  from  the  standpoint  of  modifying  the  expert 
system.  If  we  wanted  to  modify  MYCIN's  default 
strategy,  there  would  be  no  single  place  to  modify. 
Instead,  we  would  have  to  carefully  locate  by  hand  each 
of  the  rules  that  instantiated  that  strategy  and  modify 
them,  with  all  the  attendant  possibilities  for  making  a 
mistake. 

Plans  in  EES  are  usually  expressed  at  a  higher  level  of 
abstraction  than  the  rules  or  methods  of  conventional 
expert  systems.  Modification  or  extension  of  the  system’s 
problem  solving  knowledge  is  performed  at  the  level  of 
plans,  and  EES  is  responsible  for  propagating  the  results 
of  such  modifications  into  the  implementation  of  the 
expert  system.  This  eases  extension  and  modification. 
Such  abstraction  also  opens  up  the  possibility  of  re-using 
existing  problem  solving  knowledge  in  new  domains  and 
of explaining problem solving methods at a general level
(see  [Swartout  83]). 

As  an  example,  consider  a  plan  for  compensating  for 
drug  sensitivities.  It  captures  the  common  sense  notion 
that  if  a  patient  has  a  sensitivity  to  a  particular  drug, 
then  the  dose  of  that  drug  should  be  reduced.  This  plan 
may  be  paraphrased  as  follows: 

Capability-description:
    Compensate drug dose for a drug
    sensitivity

Method:
    If the drug sensitivity exists
    in the patient,
    then reduce the drug dose because
    of the drug sensitivity
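
A plan of this shape, with a capability description that binds variables when matched against a goal, can be sketched in Python as follows. This is a simplified illustration: matching here is purely positional pattern matching with "?" variables, whereas the text describes matching through subsumption in the terminology hierarchy. The names Plan, match, instantiate and COMPENSATE are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Plan:
    capability: tuple   # pattern; items starting with "?" are variables
    method: str         # template written over the same variables


def match(pattern: tuple, goal: tuple) -> Optional[dict]:
    """Bind pattern variables against a goal, or return None on mismatch."""
    if len(pattern) != len(goal):
        return None
    bindings = {}
    for p, g in zip(pattern, goal):
        if p.startswith("?"):
            # A variable must bind to the same value everywhere it occurs.
            if bindings.setdefault(p, g) != g:
                return None
        elif p != g:
            return None
    return bindings


def instantiate(plan: Plan, goal: tuple) -> Optional[str]:
    """Fill in the plan's method with the bindings produced by the match."""
    bindings = match(plan.capability, goal)
    if bindings is None:
        return None
    text = plan.method
    for var, value in bindings.items():
        text = text.replace(var, value)
    return text


COMPENSATE = Plan(
    capability=("compensate", "?drug", "dose for", "?sensitivity"),
    method="if ?sensitivity exists in the patient, "
           "then reduce the ?drug dose because of ?sensitivity",
)
```

Matching the goal ("compensate", "digitalis", "dose for", "increased serum calcium") binds ?drug and ?sensitivity, and instantiating the method then gives a rule specific to that one sensitivity.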

We  shall  now  show  how  this  plan,  together  with  domain 
descriptive  knowledge  and  terminology,  can  be  used  to 
implement  specific  rules  for  checking  for  specific 
sensitivities. 

3.  Combining  Different  Kinds  of  Expertise:  The 
Program  Writer 

The program writer combines these different kinds of
expertise together to produce a working implementation
of  an  expert  system.  The  program  writer  creates  the 
expert  system  using  goal  refinement  and  reformulation. 
Starting  with  a  high  level  goal  (such  as  "administer 
digitalis")  the  writer  searches  through  its  hierarchy  of 
plans  for  those  plans  whose  capability  descriptions 
subsume (that is, match) the goal. It selects one of the
matching  plans  and  instantiates  its  method.  This  results 
in  the  posting  of  subgoals  in  the  method  as  further  goals 
to  be  implemented,  and  the  program  writer  searches  for 
plans  to  implement  those  goals  in  turn.  The  writer 
continues in this fashion until all goals have been
implemented in terms of primitive constructs.²

In  the  event  that  no  plan  is  found  for  implementing  a 
goal,  the  program  writer  attempts  to  reformulate,  or 
transform,  the  goal  into  a  goal  or  set  of  goals  that  can  be 
implemented.  We  added  this  capability  to  provide 
several  benefits.  Maintenance  and  initial  system 
construction  would  be  easier  because  the  program  writer 
would  be  able  to  bridge  larger  gaps  between  plans  and 
goals,  and  knowledge  would  be  re-usable  in  a  larger  range 
of  situations.  We  identified  several  different  kinds  of 
reformulations  (see  [Neches,  et  al.  85]  for  a  detailed 
discussion).  As  an  example,  we  shall  consider 
reformulation  into  cases,  that  is,  reformulating  a  goal  of 
an  action  to  be  performed  over  a  set  of  objects  into  a  set 
of  goals  where  the  action  is  performed  on  individual 
elements  of  the  original  set  of  objects.  Such 
reformulation  takes  place  frequently  (but  implicitly)  in 
expert  systems. 

For  example,  in  many  diagnostic  systems  a  problem 
that  arises  is  to  determine  how  likely  it  is  that  a  patient 
has  some  disease  based  on  its  signs  and  symptoms.  In 
conventional  expert  systems  that  goal  is  usually  not 
explicitly  represented  in  the  system,  because  the  system 
designer  mentally  reformulates  it  while  constructing  the 
system.  What  does  appear  is  the  result  of  the 
reformulation:  a  set  of  goals  that  inquire  about  each  of 
the  symptoms  individually  and  a  combining  function  that 
deals  with  the  problem  of  how  to  collect  the  individual 
assessments  of  signs  and  symptoms  into  an  appropriate 
overall  assessment  for  the  disease.  EES  allows  us  to 
represent  the  original  goal,  the  reformulation,  and  the 
result  of  the  reformulation  explicitly.  As  we  will  show  in 
the  example  below,  goal  reformulation  plays  an 
important  role  in  integrating  domain  descriptive 
knowledge,  problem  solving  knowledge,  and  terminology. 
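
The refinement loop described in this section (find a plan, post its subgoals, fall back on reformulation when no plan applies, and record every step) can be sketched as a short recursive procedure. The sketch is a toy under stated assumptions, not the EES program writer: goals are plain strings, plan lookup is by exact match rather than subsumption, and the contents of PLANS, PRIMITIVES and SETS, including the decomposition of "administer digitalis", are made up for illustration.

```python
# Hypothetical toy knowledge. Plans map a goal to its subgoals.
PLANS = {
    "administer digitalis": [
        "compensate dose for digitalis sensitivities",
        "set dose",
    ],
    "compensate dose for increased serum calcium": [
        "check serum calcium", "reduce dose",
    ],
    "compensate dose for decreased serum potassium": [
        "check serum potassium", "reduce dose",
    ],
}
PRIMITIVES = {
    "set dose", "reduce dose", "check serum calcium", "check serum potassium",
}
# Goals stated over a set, with one goal per known member of the set.
SETS = {
    "compensate dose for digitalis sensitivities": [
        "compensate dose for increased serum calcium",
        "compensate dose for decreased serum potassium",
    ],
}


def write_program(goal: str, trace: list) -> list:
    """Refine `goal` into primitive steps, recording each decision in `trace`."""
    if goal in PRIMITIVES:
        return [goal]
    if goal in PLANS:                       # a plan matches the goal
        subgoals, how = PLANS[goal], "plan"
    elif goal in SETS:                      # no plan: reformulate into cases
        subgoals, how = SETS[goal], "reformulation into cases"
    else:
        raise LookupError(f"no plan or reformulation for goal: {goal}")
    trace.append((goal, how, subgoals))
    steps = []
    for sub in subgoals:
        steps.extend(write_program(sub, trace))
    return steps
```

After a call such as write_program("administer digitalis", trace), the list trace holds the original goal, the reformulation, and the goals that resulted from it, which is the kind of record an explanation facility needs but a hand-built system normally discards.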

3.1.  An  Example 

To  make  the  operation  of  the  program  writer  clearer, 
let  us  consider  an  example  from  the  digitalis  domain 
concerned with the problem of adjusting the patient's
dose to account for digitalis sensitivities.³

At  some  point  in  the  program  writing  process  a  goal 
would  be  posted  to: 

Compensate digitalis dose for
digitalis sensitivities

This is a goal to compensate the dose for all the digitalis
sensitivities that are known to the system (by being
represented in the domain descriptive knowledge) or that
can be deduced by the system. The program writer


² These include constructs for setting a variable, conditional
constructs, and the like, corresponding to LISP constructs such as
SETQ, COND and so forth.


would search through the hierarchy of plans to find all
the  plans  whose  capability  description  subsumed  the  goal. 
In  this  case,  none  would  be  found.  The  plan  in  Section 
2.3  might  seem  to  be  directly  applicable  to  this  goal 
because  the  pattern  in  its  capability  description  looks 
very  similar  to  the  goal.  In  fact,  we  must  do  some  goal 
reformulation  before  that  plan  can  be  applied.  The 
problem  is  that  the  goal  requires  compensating  for  all  the 
sensitivities,  but  the  plan  can  only  compensate  for  an 
individual  sensitivity.  Thus,  we  must  reformulate  the 
goal  into  a  set  of  goals  over  individual  sensitivities  before 
program  writing  may  proceed.  The  system  does  that  by 
making  use  of  its  terminological  and  domain  descriptive 
knowledge.  Consulting  the  terminological  knowledge  for 
the  definition  of  sensitivity  (given  in  Section  2.1),  it  finds 
that  a  drug  sensitivity  is  an  observable  deviation  that 
causes  something  dangerous  to  happen  that  is  also  caused 
by  the  drug.  Specializing  that  term  to  digitalis 
sensitivity,  and  using  the  domain  facts  (given  in  Section 
2.2),  the  classifier  finds  two  individual  observable 
deviations  that  are  digitalis  sensitivities:  increased 
serum  calcium  and  decreased  serum  potassium. 
The writer then reformulates the original goal over
digitalis  sensitivities  into  two  goals  over  individual 
digitalis  sensitivities: 

Compensate  digitalis  dose  for 
increased  serum  calcium 

Compensate  digitalis  dose  for 
decreased  serum  potassium 

The method of Section 2.3 can then be applied to each of
these two goals. When that is done, the methods are
instantiated  to  produce  two  code  fragments: 

If  increased  serum  calcium  exists  in 
the  patient 

then  reduce  the  digitalis  dose 
because  of  increased  serum 
calcium 

If  decreased  serum  potassium  exists 
in  the  patient 

then  reduce  the  digitalis  dose 
because  of  decreased  serum 
potassium 


³ The example we present here was actually implemented using the
XPLAIN framework [Swartout 83], which also produced the sample
explanations that appear in Figure 3-1. XPLAIN was a precursor to
the EES framework and did not provide an explicit representation
for terminology. We feel that version I of EES provides a cleaner
conceptualization of the program writing process, and we have
implemented two demonstration-sized systems in EES version
I. Unfortunately, certain limitations on the expressivity of the
knowledge representation system we used in EES version I would
have made it difficult to express some of the terminology used in this
example, particularly sensitivity. We are using an extended
representation in EES version II, which we describe in Section 4. We
are presenting this example from the perspective of EES, rather than
XPLAIN, because we feel it provides a more understandable account
of the program writing process.
The problems of determining whether increased serum
calcium  or  decreased  serum  potassium  exist  and  of 
reducing  the  dose  are  then  posted  as  new  goals  for  the 
system  to  implement. 

An  additional  problem  that  confronts  the  program 
writer  is  to  reason  about  how  to  integrate  the  two  code 
fragments.  Since  both  fragments  cause  the  dose  to  be 
reduced  the  issue  is  to  determine  what  should  be  done  if 
both  sensitivities  co-occur.  This  is  an  example  of  what 
we  refer  to  as  an  integration  problem,  that  is,  how  to 
integrate similar conclusions reached by multiple
knowledge  sources.  In  most  expert  systems,  this  kind  of 
problem  is  handled  by  some  implicit  mechanism  built 
into  the  system’s  interpreter  (like  the  certainty  factor 
mechanism  in  MYCIN).  Because  the  mechanism  is  built 
into the interpreter, it is convenient to use, but also
subject  to  abuse,  since  the  assumptions  that  underlie  it 
are  never  explicitly  checked.  Also,  usually  only  one 
mechanism  is  provided,  so  system  builders  will  attempt 
to  apply  that  mechanism  to  as  many  situations  as 
possible,  even  if  its  appropriateness  is  questionable.  We 
argue  that  integration  problems  should  be  reasoned  about 
explicitly  by  the  program  writer  while  the  expert  system 
is  being  created.  Taking  that  approach  allows  the 
assumptions  that  underlie  an  integration  technique  to  be 
checked.  Also,  several  techniques  may  be  represented, 
allowing  the  program  writer  to  select  the  most 
appropriate  one  for  the  problem  at  hand.  In  this 
particular  case,  the  system  uses  a  piece  of  integration 
knowledge  that  tells  it  that  the  two  program  fragments 
can  be  chained  together  (that  is,  connect  the  outputs  of 
one to the inputs of the next) if the causal relations that
the  fragments  are  based  on  are  independent  and  additive 
(see  [Swartout  83]  for  further  details). 
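
Chaining, in the sense of connecting the output of one fragment to the input of the next, can be sketched with ordinary function composition. The sketch below is illustrative only and assumes a particular form for each fragment: a function from a dose and a patient record to an adjusted dose. The names make_fragment and chain are hypothetical, and the reduction factors used with them are arbitrary numbers chosen for the example, not clinical values and not values from the source.

```python
from functools import reduce
from typing import Callable

Fragment = Callable[[float, dict], float]


def make_fragment(finding: str, factor: float) -> Fragment:
    """Build 'if <finding> exists in the patient then reduce the dose'."""
    def fragment(dose: float, patient: dict) -> float:
        return dose * factor if patient.get(finding, False) else dose
    return fragment


def chain(fragments: list) -> Fragment:
    """Feed the dose produced by each fragment into the next one."""
    def chained(dose: float, patient: dict) -> float:
        return reduce(lambda d, f: f(d, patient), fragments, dose)
    return chained


# Arbitrary illustrative factors; not clinical values.
adjust = chain([
    make_fragment("increased serum calcium", 0.5),
    make_fragment("decreased serum potassium", 0.5),
])
```

When both findings are present, the dose is reduced once by each fragment in turn. Whether that combined reduction is appropriate is exactly the assumption about independence that the program writer is meant to check before choosing this integration technique.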

This  entire  process  was  recorded  so  that  it  could  later 
be  used  in  giving  much  richer  explanations  that  reflected 
the  causal  underpinnings  that  the  expert  system  was 
based on, as shown in Figure 3-1. The critical difference
between that explanation and the one in Figure 1-1 lies in
the second and third sentences of the first explanation,
which provide a causal reason for checking serum
calcium. This explanation was produced by paraphrasing
the causal relations that matched the domain rationale of
the plan used to generate the code for checking serum
calcium.

4.  Capturing  Intent  and  the  Roots  of  Problem 
Solving  Knowledge 

While  the  EES  version  I  framework  allows  us  to  capture 
the  knowledge  needed  to  explain  the  rationale  that 
underlies an expert system, two issues remain. First,
EES  version  I  still  does  not  provide  the  capability  to 
represent  the  intent  behind  a  goal  (our  third  question 
type).  For  example,  it  is  not  possible  to  answer  the 
question:  "What  does  it  mean  to  administer  digitalis?" 
Problem solving knowledge of how to give digitalis can be


retrieved,  but  it  is  not  represented  anywhere  that  the 
problem  of  digitalis  administration  is  a  problem  of 
finding  a  dosage  level  of  digitalis  that  produces 

Please  enter  the  value  of  serum  calcium:  why? 

The  system  is  anticipating  digitalis  toxicity. 
Increased  serum  calcium  causes  increased 
automaticity,  which  may  cause  a  change  to 
ventricular  fibrillation.  Increased  digitalis  also 
causes  increased  automaticity.  Thus,  if  the 
system  observes  increased  serum  calcium,  it 
reduces  the  dose  of  digitalis  due  to  increased 
serum  calcium. 


Please  enter  the  value  of  serum  calcium:  9 

Please  enter  the  value  of  serum  potassium: 
why? 

(The  system  produces  a  shortened  explanation, 
reflecting  the  fact  that  it  has  already  explained 
several  of  the  causal  relationships  in  the  previous 
explanation. Also, since the system remembers that
it has already told the user about serum calcium, and
because it knows that the same plan was used to
generate the code for both serum potassium and
serum  calcium,  it  suggests  the  analogy  between  the 
two  here.) 

The  system  is  anticipating  digitalis  toxicity. 
Decreased  serum  potassium  also  causes 
increased  automaticity.  Thus,  (as  with 
increased  serum  calcium)  if  the  system  observes 
decreased  serum  potassium,  it  reduces  the  dose 
of  digitalis  due  to  decreased  serum  potassium. 

Please  enter  the  value  of  serum  potassium:  3.7 

Figure  3-1:  A  Causal  Explanation  of  Why  Serum 
Calcium and Potassium are Checked

satisfactory  therapeutic  results  subject  to  the  constraint 
of  avoiding  (or  minimizing)  toxic  effects. 

As  another  example,  consider  an  expert  system  we  built 
using  EES  for  diagnosing  faults  in  space  telemetry 
systems.  This  system  had  several  methods  for  diagnosis, 
which we have hand-paraphrased in Figure 4-1. Such a
display is really the only response that system could have
given to the question: "What does it mean to diagnose a
decomposable system?" With some effort a user might be
able to examine Figure 4-1 and figure out how the system
performs a diagnosis, but it would be considerably harder
for him to figure out what a diagnosis amounts to. What
is needed is an explicit representation of the intent
behind  a  goal  that  would  allow  us  to  answer:  "To 
diagnose  a  decomposable  system  means  to  find  a 
primitive  subcomponent  of  the  system  that  is  faulty." 

To diagnose a decomposable system,

If  there  is  a  fault  in  the  system, 
then  locate  the  cause  of  the  fault 
within  the  system 

To  diagnose  a  primitive  system, 

If  the  system  is  faulty, 
then  conclude  it  is  the  diagnosis 

To  locate  the  cause  of  a  fault  within  a 
system  which  is  loosely-coupled, 

Diagnose  the  subcomponents  of  the  system 

To  locate  the  cause  of  a  fault  within  a 
system  which  is  tightly-coupled, 

Locate  the  cause  of  the  fault  along  the 
signal-path  beginning  at  the 
system-input  and  ending  at  the 
system-output. 

To locate the cause of a fault beginning at
system1 and ending at system2,

If system1 is faulty
then diagnose system1
else locate the cause of the fault along
the signal-path beginning at the system
that system1 outputs to and ending at
system2.

Figure  4-1:  Methods  as  an  Inadequate  Explanation  of 
the  Goal  of  Diagnosis 
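
The methods paraphrased in Figure 4-1 can be sketched as a small set of mutually recursive procedures. This is only an illustration in Python, not EES code: the `System` class, its fields (`faulty`, `coupling`, `subcomponents`, `outputs_to`) and the function names are hypothetical, and a tightly-coupled system is assumed to list its non-empty signal path in order from system-input to system-output.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class System:
    name: str
    faulty: bool = False                   # observed fault status (assumed given)
    coupling: str = "primitive"            # "primitive", "loose", or "tight"
    subcomponents: List["System"] = field(default_factory=list)
    outputs_to: Optional["System"] = None  # next system on the signal path


def diagnose(system):
    """Return the name of a faulty primitive system, or None."""
    if not system.faulty:
        return None
    if system.coupling == "primitive":
        return system.name                 # conclude it is the diagnosis
    return locate_fault_within(system)


def locate_fault_within(system):
    if system.coupling == "loose":
        # Loosely-coupled: diagnose the subcomponents of the system.
        for sub in system.subcomponents:
            result = diagnose(sub)
            if result is not None:
                return result
        return None
    # Tightly-coupled: follow the signal path from system-input to system-output.
    return locate_fault_along(system.subcomponents[0], system.subcomponents[-1])


def locate_fault_along(system1, system2):
    if system1.faulty:
        return diagnose(system1)
    if system1 is system2 or system1.outputs_to is None:
        return None                        # reached the end of the signal path
    return locate_fault_along(system1.outputs_to, system2)
```

Reading the code shows how a diagnosis is performed, but, as the text argues, nothing in it states what a diagnosis is meant to achieve.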


Second,  we  want  to  understand  (and  be  able  to  explain) 
the source of problem solving knowledge. While the
implementation  of  EES  described  in  the  preceding 
sections  allows  a  system  builder  to  represent  problem 
solving  expertise  at  a  more  abstract  level  than  is  possible 
in  most  expert  system  frameworks,  it  is  clear  that  even 
that  expertise  is  compiled  from  some  still  more  basic 
knowledge. The question is: what is that more basic
representation,  and  how  does  that  compilation  take 
place?  By  understanding  the  "roots"  of  problem  solving 
knowledge,  we  hope  to  be  able  to  provide  better 
explanations  of  how  the  problem  solving  knowledge 
works. 

Ultimately,  we  would  like  to  be  able  to  explain  the 
compromises  and  approximations  that  were  involved  in 
an  implementation  and  the  way  in  which  conflicting 
preferences were resolved. We are still a long way from
that  goal,  but  we  have  developed  a  basic  framework 
within  which  we  can  begin  to  address  these  issues. 

The  approach  we  have  adopted  to  address  these 
problems  is  to  represent  goal  intent  in  terms  of  a  small 
number  of  primitive  actions  and  then  to  mechanically 
derive  methods  for  achieving  those  goals  by  transforming 
definitions  and  axioms  in  the  domain  descriptive 
knowledge  base.  As  shown  in  Figure  4-2  this  modifies 
the  EES  architecture  by  adding  a  new  transformational 
component  that  derives  EES  plans  by  transformation.  As 
before, these plans will then be used by the program writer
to  create  an  expert  system. 

4.1.  Capturing  Intent:  Primitive  Actions 

In general, expert systems lack any specification of
what their goals mean that would make it possible to
answer questions about intent. Thus, most expert system
goals are defined like their terminology: implicitly. Goals
acquire their meaning based solely on the methods that
implement them. What we would like is a separate
definition for the intent of goals. This would essentially
serve as a specification for what achieving the goal
entails. It would provide a basis for the meaning of goals
that would allow the system to explain what they mean
and provide an independent criterion that could be used
in deciding whether or not the system understood goals
in the same way as the user.

Figure 4-2: Architecture for EES version II

The  problem  of  representing  intent  is  really  a  problem 
of  defining  terminology,  but  specialized  to  defining  verbs 
and  verb  clauses.  To  provide  a  basis  for  these 
definitions,  we  have  attempted  to  uncover  a  core  set  of 
primitive actions. Higher-level goals are defined and
explained  in  terms  of  these  primitive  actions.  These 
primitive  actions  are  objects  that  cannot  be  further 
explained,  so  in  defining  a  set  of  primitive  actions,  it  is 
important  to  select  actions  that  almost  every  user  can  be 
assumed  to  understand  readily. 

This  approach  has  its  roots  in  Pappus'  discussion  of 
solving  mathematical  problems  which  was  analyzed  at 
great  length  by  Polya  [Polya  71].  Pappus  based  his  work 
on  the  observation  that  all  mathematical  problems  could 
be reduced to the primitive actions of finding and
proving  and  that  problem  solving  was  a  matter  of 
planning  subgoals  (also  expressed  in  terms  of  those 
primitive  actions)  through  which  those  actions  could  be 
achieved. 

We have been exploring this approach in the context of
a system for diagnosing digital circuits. So far, we have
identified and extensively analyzed two primitive actions
in this domain:

1. determine-whether: establishes the truth
of  a  given  assertion 

2.  find:  finds  an  object  that  matches  a  given 
description 

Given  these  primitive  actions,  capturing  the  intent 
behind  a  domain  level  goal  then  involves  linking  that 
goal  to  its  definition  in  terms  of  primitive  actions.  Thus, 
we would define the goal:

"diagnose decomposable digital system s"

as the problem:

"finding a primitive system p such that p is a
subcomponent  of  s  and  p  is  faulty" 
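
The definition above can be read almost directly as a call to the find primitive. The following is a minimal sketch under stated assumptions: systems are modelled as nested dictionaries with hypothetical keys `parts` and `faulty`, "subcomponent" is taken to include indirect subcomponents, and a system with no parts is treated as primitive. None of these representational choices come from EES itself.

```python
def find(candidates, description):
    """Primitive action `find`: return an object matching the description, or None."""
    for obj in candidates:
        if description(obj):
            return obj
    return None


def all_subcomponents(system):
    """Yield every direct and indirect subcomponent of a system."""
    for sub in system.get("parts", []):
        yield sub
        yield from all_subcomponents(sub)


def diagnose_decomposable(s):
    # Intent: find a primitive system p such that
    # p is a subcomponent of s and p is faulty.
    return find(
        all_subcomponents(s),
        lambda p: not p.get("parts") and p.get("faulty", False),
    )
```

Because the goal is stated as a description to be matched, the same statement can serve both as a specification to execute and as the text of an answer to "what does it mean to diagnose s?".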

We acknowledge that the primitive actions find and
determine-whether may not accommodate other areas of
problem solving as readily as mathematics or circuit
diagnosis. Thus, we believe that our set of primitive
actions will grow somewhat as we gain more experience
with this approach. For example, a "canonical"
representation of the goal of administering digitalis might
be expressed as finding a digitalis dosage level which
satisfies a set of assertional constraints; but what are we
to make of goals involving the verb "compensate"? Such
an action might be expressed in terms of achieving (or
avoiding) a patient state; and, certainly, achieving and
avoiding are actions that are often found in problem
solving. However, we are still in the process of
investigating how such actions may best be represented
as primitive actions.

4.2.  The  Roots  of  Problem  Solving  Knowledge 

While  our  study  of  primitive  actions  is  still  an  area  of 
active  research,  it  has  already  yielded  valuable 
consequences  for  our  representation  of  problem  solving 
knowledge.  We  shall  discuss  two  of  these  consequences  in 
greater  detail: 

1.  The  mechanical  derivation  of  plans  from  the 
declarative  representation  of  definitions  and 
domain  descriptive  knowledge. 

2.  The  introduction  of  domain-independent 
problem  solving  knowledge  which  may  be 
integrated  with  domain-specific  plans. 

Mechanically  derived  plans 

As  has  already  been  observed,  domain  descriptive 
knowledge  can  be  likened  to  textbook  knowledge.  Often, 
however,  the  major  problem  with  such  knowledge  is  that 
it  offers  little  help  regarding  how  it  should  be  used.  For 
example,  what  is  the  use  of  knowing  that  decreased 
serum potassium causes increased automaticity?

In  Section  3  that  question  was  answered  in  terms  of  the 
behavior  of  an  automatic  programmer  which  generated 
expert  system  code.  However,  the  knowledge  was  used  to 
tie  together  specific  units  of  problem  solving  knowledge 
which  had  no  direct  connection  back  to  the  definitions 
and  domain  descriptive  knowledge.  We  wished  to 
consider  an  alternative  situation  in  which  plans  could  be 
derived  from  declarative  knowledge,  rather  than  entered 
independently. 

Our approach is to construct a set of transformations,
specific to each of the primitive actions, that can be
applied to facts in the domain descriptive knowledge to
create problem solving knowledge in the form of plans.
For a very simple example, if the knowledge base
contains the assertion that "A exists if and only if B
exists," then it is possible to derive a plan that
determines whether B exists by checking for the presence
of A. Since the implication is two-way, it is also possible
to derive another plan for checking for the existence of A
by checking for B.*

*Of course, care must be taken in interpreting such plans to avoid
circular reasoning chains.
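
The A/B example, together with the caution in the footnote, can be sketched as follows. This is an illustrative assumption about how such a transformation might look, not the EES transformation system: `derive_plans`, `determine_whether_exists`, and the `visiting` set used to cut circular chains are all hypothetical names.

```python
def derive_plans(biconditionals):
    """For each fact 'a exists iff b exists', derive two determine-whether plans."""
    plans = {}
    for a, b in biconditionals:
        plans.setdefault(a, []).append(b)   # determine whether a exists by checking b
        plans.setdefault(b, []).append(a)   # and, since the fact is two-way, vice versa
    return plans


def determine_whether_exists(item, plans, observed, visiting=None):
    """Return True or False if existence can be established, None if it cannot."""
    if item in observed:
        return observed[item]
    visiting = set() if visiting is None else visiting
    if item in visiting:
        return None          # already being established: avoid a circular chain
    visiting.add(item)
    for other in plans.get(item, []):
        result = determine_whether_exists(other, plans, observed, visiting)
        if result is not None:
            return result
    return None
```

With nothing observed, the two derived plans would otherwise call each other indefinitely; the `visiting` check makes the interpreter give up and report that the answer is unknown.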


Considerably  more  complex  examples  can  be  handled 
by  our  transformations.  For  example,  in  constructing  the 
digital  circuit  diagnoser,  the  domain  descriptive  model 
included  the  circuit  schematic,  detailing  the 
interconnections  among  devices  in  the  circuit,  and 
descriptions  of  the  functional  behavior  of  those  devices. 
Thus, the domain descriptive knowledge included
facts such as:

multiplier  M2  is  connected  to  adder 
A1 

the  expected  output  of  an  adder  is 
equal  to  the  sum  of  its  expected 
inputs 

From these facts, and others describing the remaining topology
of  the  circuit  and  the  behavior  of  other  devices,  it  was 
possible  to  mechanically  derive  a  set  of  procedures  for 
finding  the  expected  signal  value  along  any  connector  in 
the  circuit,  given  a  particular  set  of  input  values. 
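
To make the target of such a derivation concrete, here is a hand-written sketch of the kind of procedure described: it computes the expected value on a connector from connection facts and behavior descriptions. It does not show the derivation mechanism itself. The table names (`BEHAVIOR`, `device_type`, `inputs_of`), the assumption that every device has exactly two inputs, and the multiplier behavior are illustrative assumptions.

```python
import operator

# Behavior descriptions: expected output as a function of expected inputs.
# The adder entry mirrors the fact quoted above; the multiplier entry is assumed.
BEHAVIOR = {"adder": operator.add, "multiplier": operator.mul}


def expected_value(connector, device_type, inputs_of, circuit_inputs):
    """Expected signal value on `connector`, which is either a circuit input
    or the output of a device whose type and input connectors are known."""
    if connector in circuit_inputs:
        return circuit_inputs[connector]
    left, right = inputs_of[connector]
    return BEHAVIOR[device_type[connector]](
        expected_value(left, device_type, inputs_of, circuit_inputs),
        expected_value(right, device_type, inputs_of, circuit_inputs),
    )
```

For instance, with `inputs_of = {"M2": ("c", "d"), "A1": ("M1", "M2"), ...}` the connection fact "multiplier M2 is connected to adder A1" appears as `M2` being one of the inputs of `A1`.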

Encouraged  by  our  results  in  the  domain  of  diagnosis  of 
digital  circuits,  we  have  begun  a  re-implementation  of 
portions  of  the  digitalis  advisor  using  this  framework. 
An  interesting  observation  is  emerging  from  our  initial 
work  in  the  digitalis  domain,  which  is  that  different 
kinds  of  primitive  actions  seem  to  involve 
transformations  over  different  kinds  of  domain 
descriptive  knowledge.  The  transformations  for  deriving 
plans  for  performing  find  actions  involve  sets  and 
instances,  determine-whether  involves  implications  and 
types,  and  achieve  and  avoid  involve  states,  state 
transitions,  and  causality. 

We feel that the transformation mechanisms capture
some  very  general  kinds  of  problem  solving  knowledge 
which  is  transformed  into  more  specific  plans  given  the 
domain  particulars  as  expressed  in  the  domain  descriptive 
knowledge.  If  the  number  of  primitive  actions  remains 
relatively  small,  the  fact  that  that  general  problem 
solving  knowledge  is  only  captured  implicitly  in  the 
transformation  system  will  not  be  too  much  of  a  problem 
because  it  will  be  feasible  to  build  that  knowledge  into 
the  explanation  routines  as  well.  On  the  other  hand,  if 
the  number  of  primitive  actions  grows,  such  an  approach 
will  not  be  feasible,  and  it  will  be  necessary  to  find  more 
declarative means for defining the problem solving
knowledge. 

Weak  methods 

Because the objects affected by primitive actions are
very simply defined, one may conceive of plans whose
capability descriptions involve such actions in a
domain-independent manner. For example, performing the
determine-whether action on an assertion which is a
conjunction does not involve any domain-specific
knowledge. The method for such a plan would be based
on subgoals which perform the determine-whether action
on each of the conjuncts. Thus, one may develop a set of
plans for the determine-whether action corresponding to
the different syntactic possibilities of an assertion whose
truth is being determined.
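
A domain-independent plan of this kind might be sketched as a dispatch on the syntactic form of the assertion. The tuple encoding of assertions and the `facts` lookup are assumptions made for illustration; only the conjunction case is described in the text, and the disjunction and negation cases are added here as plausible further "syntactic possibilities".

```python
def determine_whether(assertion, facts):
    """Weak method: dispatch on the syntactic form of the assertion.

    Assertions are nested tuples such as ("and", p, q), ("or", p, q),
    ("not", p), or an atomic string looked up in `facts`.
    """
    if isinstance(assertion, str):
        return facts[assertion]                     # atomic: needs domain knowledge
    op, *args = assertion
    if op == "and":                                 # one subgoal per conjunct
        return all(determine_whether(a, facts) for a in args)
    if op == "or":                                  # assumed extension
        return any(determine_whether(a, facts) for a in args)
    if op == "not":                                 # assumed extension
        return not determine_whether(args[0], facts)
    raise ValueError(f"unknown assertion form: {op!r}")
```

Only the atomic case touches domain content; every other branch depends on the form of the assertion alone, which is what makes the plan domain-independent.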

Such plans are generalizations of the domain-specific
plans which have already been considered. Any goal
which matches the capability description of a domain-specific
plan may also match the capability description of
one or more of these more general plans. We call the
domain-independent plans weak methods, and they play
two important roles:

1. They provide an operational semantics for
primitive actions. Thus, the weak methods
provide knowledge of how to determine-whether
an assertion holds, regardless of that
assertion’s domain-specific content.

2.  This  semantic  base  allows  the  problem  solver 
to  apply  "first  principles"  to  any  goal  which 
cannot  be  accommodated  by  domain-specific 
knowledge.  If  no  domain-specific  plan  has  an 
appropriate  capability  description,  one  can 
always  resort  to  the  weak  methods. 
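
The fallback behavior in the second role can be sketched as a plan selector. This is a deliberately simplified illustration: capability descriptions are matched here by plain equality rather than by position in a plan hierarchy, and the plan names in the usage note are invented placeholders.

```python
def select_plan(goal, domain_plans, weak_methods):
    """Prefer a domain-specific plan whose capability description matches the
    goal; otherwise resort to the weak method for the goal's primitive action."""
    action, _ = goal                      # a goal is (primitive action, target)
    for capability, plan in domain_plans:
        if capability == goal:            # assumed: exact match stands in for
            return plan                   # matching in the plan hierarchy
    return weak_methods[action]
```

For example, given one domain-specific plan for `("find", "faulty-adder")`, the goal `("find", "faulty-multiplier")` has no matching capability description and is handled by whatever weak method is registered for `find`.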

Our  view  of  "weak  methods"  differs  from  the  more 
common  use  of  the  term,  such  as  may  be  found  in  [Laird 
83].  This  more  familiar  usage  is  based  on  a  view  of 
problem  solving  as  application  of  operators  within  a 
problem space. From this approach, weak methods serve
as  decision  procedures  for  the  selection  of  the  most 
appropriate  operators.  In  our  approach  a  problem  is 
represented  not  by  a  problem  space  but  by  a  primitive 
action  which  must  be  achieved,  and  the  solution  of  a 
problem  is  the  planning  of  subsidiary  actions  which  will 
obtain  this  result.  Weak  methods,  then,  may  be  regarded 
as  the  most  general  plans  for  performing  primitive 
actions,  because  they  deal  with  the  actions  themselves, 
rather  than  with  any  problem-dependent  objects  affected 
by  the  actions.  Nevertheless,  our  approach  does  share 
some  common  ground  with  the  problem  space  view  of 
weak methods. For example, the generate-and-test
method  of  [Laird  83]  is  represented  in  our  system  as  a 
weak  method  for  dealing  with  the  find  action.  Indeed, 
we  conjecture  that  every  weak  method  analyzed  in  [Laird 
83]  may  be  represented  as  the  realization  of  some 
primitive  action. 

We  recognize  that  due  to  limitations  in  our  technology 
for  automatically  deriving  plans,  there  may  be  desired 
expert  behavior  that  we  cannot  achieve  relying  solely  on 
mechanically  derived  plans  and  weak  methods.  For  that 
reason,  we  have  included  the  possibility  of  manually 
defining  strong  (domain-specific)  methods  and  entering 
them  directly  into  the  plan  hierarchy  (see  Figure  1-2). 
Like  other  plans,  these  are  organized  in  the  hierarchy 
based  on  their  capability  descriptions.  The  program 
writer  retrieves  and  instantiates  them  just  like  other 
plans. The difference is that these plans are less
explainable. They cannot be related back to
underlying domain descriptive knowledge, as with the
mechanically derived plans. We feel that one of the
strengths of our approach is that it permits us to
approach  explanation  in  this  incremental  fashion.  We 
can  derive  some  plans  mechanically,  but  our  inability  to 
derive  other  plans  does  not  preclude  constructing  a 
system.  Thus,  it  is  possible  to  build  a  system  even  if  it 
cannot  be  fully  explained,  but  the  explanations  will 
improve  with  our  increased  understanding  of  the  relation 
between  domain  descriptive  and  problem  solving 
knowledge. 

As  of  this  writing,  we  have  constructed  sets  of 
transformations for the find and determine-whether
primitive  actions  and  have  used  them  to  derive  plans  in 
the  domain  of  digital  circuit  diagnosis.  We  have  also 
defined  approximately  25  weak  methods  for  the  find  and 
determine-whether  actions.  We  have  constructed  an 
interpreter  that  can  execute  these  plans  to  produce 
problem solving behavior. We are currently in the
process  of  integrating  the  transformational  system  with 
our  program  writer.  We  are  also  exploring  the 
applicability  of  this  approach  in  other  domains,  such  as 
digitalis  therapy.  We  feel  that  these  explorations  will 
lead  to  a  better  understanding  of  the  kinds  of  primitive 
actions that are appropriate to model.

5.  Further  Requirements  for  Explanation 

In  the  preceding  sections,  we  have  argued  that  if  a 
system  is  to  explain  its  reasoning,  it  is  necessary  to  model 
additional  kinds  of  expertise  that  are  normally  left  out  of 
a performance-oriented expert system. We have
described  the  nature  of  those  additional  kinds  of  expertise 
and  our  approach  to  capturing  that  expertise  in  an  expert 
system.  In  this  section,  we  return  to  the  issue  of 
explanation,  and  describe  some  additional  constraints 
that  explanation  imposes  on  the  way  that  knowledge  is 
structured  and  represented. 

The  Need  for  a  Continuum  of  Abstraction 

From  the  preceding  sections  it  might  appear  that  we 
feel  that  a  more  abstract  or  "deeper"  representation  for 
expertise is always best. In fact, it is important to have a
variety  of  different  levels  of  abstraction  available  for 
explanation  and  to  select  among  them  based  on  the 
experience  and  interests  of  the  user.  For  example,  we 
have  argued  that  an  explanation  that  is  just  based  on 
performance-level  expertise  is  probably  not  appropriate 
for  many  users  because  it  leaves  out  the  rationale  that 
justifies  it.  But  such  an  explanation  may  be  very 
appropriate  for  an  expert  user  who  fully  understands  the 
rationale and is just interested in assuring himself that
the system will take the correct actions in a particular set
of circumstances.

Thus, it is not just a matter of reasoning with the
compiled, performance-level knowledge for efficiency
while falling back to the deeper knowledge for
explanation. Instead, the explanation routines must be
capable of selecting among knowledge expressed at
different levels of compilation and producing explanations
from that knowledge. That implies that the different
levels of compilation and the correspondences among
them must exist together. In comparing our approach of
compiling an expert system from deeper level expertise
with an approach in which the compilation step is
skipped and the deep knowledge is directly interpreted,
we often argue for our approach on the basis of
increased efficiency, but the explanation requirement for
simultaneously existing multiple levels of compilation also
argues for our approach.

Different  Information  for  Different  Users 

Paris  [Paris  87]  has  observed  that  explanations  for 
novices  and  explanations  for  more  experienced  people 
differ  fundamentally  in  the  kind  of  information  that  is 
conveyed,  not  just  in  the  level  of  detail,  as  had  been 
previously  thought  [Wallis  82].  Paris  studied  descriptions 
of  various  devices  in  both  junior  and  adult  encyclopedias. 
She discovered that the entries in junior encyclopedias
tended  to  emphasize  the  function  of  the  device  and  the 
functional  relations  of  its  parts,  while  adult  encyclopedias 
emphasized  component/subcomponent  relationships  and 
the  physical  structure  of  the  device.  Presumably, 
functional  information  is  left  out  of  the  adult  entries 
because  adults  already  know  that  information.  While 
Paris has identified this phenomenon, she has not yet
provided  a  theory  that  will  explain  what  kinds  of 
knowledge  will  be  included  and  what  kinds  will  be  left 
out. 

We feel that the compilation process in EES may be the
beginnings of such a theory, at least for expert systems.
Explanations presented at different levels of compilation
differ fundamentally in the kinds of knowledge that are
presented. Explanations produced from uncompiled
knowledge will include definitions of terminology, causal
relations, and abstract descriptions of problem solving
strategies. That is all information that will probably be
familiar to an expert, and hence not necessary to explain
to him, but that a novice is likely to want to know.
Explanations  produced  from  compiled  knowledge  will  be 
most  appropriate  for  experts  because  they  will  not 
contain  the  motivating  knowledge  that  experts  would 
already  know. 
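
As a rough sketch of this idea, an explanation routine could keep the compiled action and the uncompiled rationale side by side and choose between them according to the user. The record layout (`action`, `rationale`) and the boolean `user_is_expert` flag are assumptions made for illustration; the example sentences paraphrase the dialogue in Figure 3-1.

```python
def explain(step, user_is_expert):
    """Select explanation content by level of compilation."""
    sentences = []
    if not user_is_expert:
        # Uncompiled knowledge: causal relations a novice is likely to want.
        sentences.extend(step["rationale"])
    # Compiled, performance-level knowledge: what the system actually does.
    sentences.append(step["action"])
    return " ".join(sentences)


step = {
    "rationale": ["Decreased serum potassium causes increased automaticity."],
    "action": "The system reduces the dose of digitalis.",
}
```

Calling `explain(step, user_is_expert=True)` yields only the action, while the novice version prefixes the causal rationale.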

6.  Conclusion 

We have argued that explanations produced solely from
performance expertise will be inadequate, because such
explanations leave out the rationale upon which the
expert system's behavior is based. We described three
kinds of expertise that must be modelled to provide
adequate explanations: knowledge of terminology, domain
descriptive knowledge, and abstract problem solving
knowledge. In conventional expert systems, these
different kinds of knowledge are confounded together in a
relatively  low  level  representation,  such  as  rules,  if  they 
are  represented  at  all.  Separating  the  different  kinds  of 
knowledge  improves  explanations  because  it  allows  the 
explanation  routines  to  select  just  the  right  information 
to  present  to  answer  a  user's  questions,  free  of 
confounding  factors.  The  separation  also  makes  the 
system  easier  to  maintain  because  it  increases  its 
modularity. 

We  presented  a  framework  for  expert  system 
construction  that  employs  an  automatic  programmer  to 
integrate  the  different  kinds  of  knowledge  together  to 
produce  a  working  expert  system.  The  program  writer 
leaves  behind  a  record  of  the  design  decisions  that 
underlie  the  expert  system.  These  "mental 
breadcrumbs"  are  used  by  explanation  routines  to 
explain  the  workings  of  an  expert  system  in  terms  of 
basic  domain  knowledge. 

We  presented  our  current  research,  which  is  concerned 
with  providing  a  representation  for  the  intent  of  goals  in 
terms  of  primitive  actions,  and  with  providing  a  better 
understanding  of  the  relationship  between  domain 
descriptive  knowledge  and  problem  solving  knowledge. 
Finally,  we  presented  some  additional  requirements  that 
explanation  imposes,  and  showed  how  our  approach 
supports  them.  Clearly,  much  remains  to  be  done  before 
expert  systems  will  be  able  to  explain  themselves  as 
lucidly  as  human  experts.  Nevertheless,  our  results  so  far 
encourage us in our future undertakings.

Abstract


The lengthening lifetimes of intelligent systems, and
the desire to share or re-use knowledge bases, have created
within the AI community the need for application-independent
knowledge representation systems. The
Loom system being developed at ISI represents the latest
in a series of "classification-based" knowledge representation
systems developed to meet this need.1 In Loom,
the traditional single-classifier architecture is replaced by
one containing a collection of classifiers which exhibit
increasingly powerful inference capabilities. This paper
describes the knowledge representation language
developed for the Loom system.

1.  Introduction 

Loom2 represents a recent entry into the KL-ONE
[Brachman and Schmolze 85] family of knowledge
representation systems. Loom directly succeeds the NIKL
system [Schmolze and Lipkis 83, Moser 83] developed
jointly by ISI and BBN. During NIKL’s lifetime, the
NIKL user community produced a rather extensive list of
extensions that they wished to see in future versions of
NIKL [Kaczmarek 86]. Loom’s designers determined that
these needs could best be achieved by redesigning and
reimplementing NIKL. The result is a more flexible
architecture which preserves the strengths of the original
NIKL while admitting some new and powerful forms of
reasoning.


1 This research is supported by the Defense Advanced Research
Projects Agency under Contract MDA903-81-C-0335. Views and
conclusions contained in this paper are the authors’ and should not
be interpreted as representing the official opinion of DARPA, the
U.S. Government, or any person or agency connected with them.

2 Loom: “A frame ... for interlacing ... sets of threads or yarns to
form a cloth” (Webster’s).


Loom’s architecture strongly reflects the view that
the variety of inferences provided by a comprehensive
knowledge representation system can best be performed
by a well-integrated collection of specialized reasoning
components, rather than by a single, general-purpose
reasoner. KL-ONE-style systems (e.g., KL-ONE, KL-TWO
[Vilain 85], KRYPTON [Brachman, Fikes, and
Levesque 83], and BACK [von Luck 87]) have traditionally
divided their knowledge space into two partitions,
called the "Terminological Box" and the "Assertional
Box", and have utilized two distinct reasoners
(terminological and assertional) to carry out their
inferences. Loom’s principal architectural contribution is
to introduce two additional partitions (the "Universal
Box" and the "Default Box"), each having its own
associated reasoning component.

Complementing this increase in the number of
domain-independent reasoners embedded in the system
architecture is a growing library of domain-specific,
"narrow-coverage" reasoners. Currently these include
facilities for computing or reasoning about transitive
relations, sets, intervals, and some elementary forms of
numeric reasoning. These reasoners can be invoked
independently, or called by the broad-coverage reasoners.

The trick in integrating this collection of reasoners
is to develop a language for expressing knowledge which
emphasizes the overall coherence and uniformity of the
knowledge structures. Loom accomplishes this goal by
building on the "concept-centered" view of knowledge
employed in KL-ONE (and NIKL). Accordingly, all
universal and default knowledge is attached to specific
concepts. In a similar vein, sets, intervals, and relations
(including transitive and composite relations) are all
realized as specialized forms of concepts; their definitions
share a uniform syntax, and each of them has its own
sublattice within the concept taxonomy.

This paper introduces the syntax and semantics of
that portion of the Loom knowledge representation
language which represents meta-level knowledge. We
include discussions on some of the types of inference which
can be performed by the Loom system. We begin by
defining the four broad types of knowledge managed by
the Loom system, and then discuss each of the "Boxes"
devoted to representing meta-level knowledge. The
appendices include the knowledge bases used to illustrate
examples of Loom syntax. A longer version of this paper
[MacGregor 87] contains a complete definition of the
Loom system.

2.  Boxes 

In order to accurately define concepts and relations
in Loom, it is necessary to have an understanding of how
Loom treats various "kinds" of knowledge within the
system. Loom partitions its knowledge space into four
"Boxes", called the Terminological, Universal, Default,
and Assertional Boxes. This section presents a brief
characterization of each of these four kinds of knowledge.
Later sections will present specifics on the expressive
features available with each of the Loom boxes.

Definitions within the "Terminological Box" (TBox)
serve to define the "terms" in our knowledge representation
scheme ([Brachman, Fikes, and Levesque 83]
contains a good discussion of what kind of knowledge is
considered to be "terminological"). A TBox definition yields
a set of necessary and sufficient conditions for recognizing
an instance of some concept. Within Loom, the
organization (classification) of concepts is based strictly on
the terminological knowledge available to the system.


The "Universal" Box (UBox) widens the scope of
things we can say about (generic) concepts to include
certain forms of knowledge about the "real world". In the
UBox we can attach necessary conditions to a concept
definition. For example, we can state that "live-persons
necessarily have heads", i.e.,

$\forall x\,[\text{Live-Person}(x) \rightarrow \exists y\; \text{head}(x, y)]$.

In the UBox we can also state conditions which are
sufficient, but not necessary, to recognize an instance of a
concept. For example, we can say that "all featherless
bipeds are human", i.e.,

$\forall x\,[\text{Featherless-Biped}(x) \rightarrow \text{Human}(x)]$.

A  second,  more  powerful  classifier  is  associated  with 
the  UBox.  The  UBox  classifier  makes  its  inferences 
(classifications)  on  the  basis  of  combined  TBox  and  UBox 
knowledge. 
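
The two kinds of UBox statement can be illustrated with a toy recognizer. This is a sketch of the distinction only, not of Loom's UBox classifier: the function name, the dictionary encoding of individuals, and the way the featherless-biped test is written are all assumptions.

```python
def ubox_classify(individual, sufficient, necessary):
    """Return the concepts recognized for an individual and the roles they imply.

    `sufficient` maps a concept to a test that is enough to recognize it;
    `necessary` maps a concept to role names every instance must fill.
    """
    concepts = set(individual["concepts"])
    for concept, test in sufficient.items():
        if test(individual):              # sufficient condition: conclude the concept
            concepts.add(concept)
    implied_roles = set()
    for concept in concepts:              # necessary condition: conclude the roles
        implied_roles.update(necessary.get(concept, ()))
    return concepts, implied_roles


sufficient = {"Human": lambda x: x["legs"] == 2 and not x["feathers"]}
necessary = {"Live-Person": {"head"}}
```

Given an individual asserted to be a `Live-Person` with two legs and no feathers, the sufficient condition adds `Human` and the necessary condition implies a filler for `head`.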

The  "Default"  Box  is  the  proper  location  for 
representing  "assumptions"  or  "default  knowledge".  For 
example,  in  it  we  can  state  such  things  as  default  values: 
"If  nothing  has  been  asserted  about  the  color  of  some 
elephant x, make the assumption 'color(x Grey)'." We
can also state some limited forms of closed-world
assumptions:3 "If some paper P has K authors, assume that it
has only K authors."
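
Both kinds of default can be sketched in a few lines. The dictionary representation and the function names `apply_defaults` and `close_authors` are hypothetical and are not Loom syntax; the point is only that a default applies when nothing has been asserted, and that the closed-world assumption treats the known fillers as the complete set.

```python
def apply_defaults(individual, default_values):
    """Fill in a default only when nothing has been asserted for that role."""
    result = dict(individual)
    for role, value in default_values.items():
        result.setdefault(role, value)
    return result


def close_authors(paper):
    """Limited closed-world assumption: the known authors are all the authors."""
    return {**paper, "author_count": len(paper["authors"])}
```

An elephant with no asserted color comes back with `color` set to the default, while one asserted to be white keeps its asserted value.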

The  knowledge  represented  in  the  Default  Box  is 
used  to  make  some  very  limited  types  of  inferences 
during  the  process  of  realization.  A  full-blown  use  of 
default  knowledge  would  seem  to  require  the  inclusion  of 
a  non-monotonic  reasoning  capability  into  Loom.  This  is 
beyond  the  scope  of  our  current  effort. 

The Assertional Box (ABox) is the repository for
assertions about individuals; by default, the ABox
assumes open-world semantics. For example, we might
place in the ABox the assertion that Clyde is a white
elephant by making the assertions:

(assert (Elephant Clyde) (color Clyde white)).

The  effect  of  these  assertions  is  to  create  an  instance  in 
the  ABox  of  the  concept  Elephant  (unless  Clyde  already 
exists  in  the  ABox)  and  to  assign  to  the  color  role  of  the 
object  Clyde  the  value  White. 
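
The described effect of these assertions can be mimicked with a small data structure. The `ABox` class below and its method names are hypothetical stand-ins written in Python, not Loom's implementation or API.

```python
class ABox:
    """Toy repository of assertions about individuals."""

    def __init__(self):
        self.individuals = {}

    def _get(self, name):
        # Create the individual unless it already exists in the ABox.
        return self.individuals.setdefault(name, {"concepts": set(), "roles": {}})

    def assert_concept(self, concept, name):
        self._get(name)["concepts"].add(concept)

    def assert_role(self, role, name, value):
        self._get(name)["roles"][role] = value


abox = ABox()
abox.assert_concept("Elephant", "Clyde")    # (Elephant Clyde)
abox.assert_role("color", "Clyde", "white") # (color Clyde white)
```

After both calls there is a single individual `Clyde`, recorded as an instance of `Elephant` with its `color` role filled.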

Loom  has  extended  NIKL’s  terminological  language 
CNIKL  [Robins  86]  to  include  expressions  of  universal 
and  default  knowledge.  We  believe  that  it  is  beneficial  to 
associate  each  fragment  of  universal  or  default  knowledge 
with  a  particular  concept;  thus,  we  have  chosen  to  extend 
the  syntax  of  the  original  defconcept  (and  defrelation) 
primitives,  rather  than  to  add  new  (top-level)  constructs 
to the terminological language. The Engines and Cars
knowledge base in Figure A-1 illustrates some Loom concept
declarations. The original CNIKL definition of a
concept serves as its definitional component. An
"axioms" clause states universal knowledge about a concept,
while a "defaults" clause states default knowledge.

Engineering  Note: 

Our introduction of a new type of reasoner (the
UBox classifier) puts us in line with what we see as a
long-range trend towards knowledge representation architectures
which will employ increasing numbers of specialized
reasoners. As the number of reasoners within a
single system increases, it will become increasingly important
that some organizing principle be available to integrate
these various reasoners. Our decision to organize
all universal and default knowledge within the context of
particular concepts illustrates a belief that the "concept-oriented"
(a.k.a. "frame-oriented") approach will prove
to be a successful organizing principle for wider and
wider classes of knowledge. Such an approach may be
contrasted with that of the current generation of rule-based
systems (including hybrid frame- and rule-based
systems); in those systems, knowledge which we have
classed as universal or default knowledge (other than
"default values") tends to be dumped unceremoniously
into a "rule base", i.e., such systems provide no formal
scheme for structuring that knowledge.

3.  Basic  Terminology 

Here  we  take  time-out  to  formalize  some  of  our 
terms. 

By a concept we mean an "intensional description"
of something. The most general instance of a concept is
called "Thing". A relation is a concept which defines a
set of k-tuples, with k being fixed for each individual
relation. By convention, the term "concept" is often
used to refer to (the more specialized notion of) a unary
relation. Thus, the defconcept form defines a unary
relation.

A  binary-relation  for  which  the  roles  domain  and 
range  have  been  assigned  will  be  called  a  mapping.  By 
convention,  the  term  "relation"  may  be  used  in  place  of 
the  word  "mapping",  and  the  form  defrelation  is  used 
to  define  a  mapping.  The  most  general  instance  of  a 
mapping is called "maps-to".4 The Loom implementation
is intended to accommodate relations of order
greater  than  two,  but  a  complete  syntax  for  defining 
higher-order  relations  has  not  yet  been  worked  out.  A 
relation  which  has  been  reified  (equated  with  a  unary 
concept  of  the  same  name)  is  termed  a  relationship. 

The domain of a mapping is not considered to be a
part of its (TBox) definition. The association of a mapping
with a particular (domain) concept, other than the
concept THING, induces a sub-relation we call a role.5 A
role restriction which associates a mapping M with a
concept C defines a role $R_{M,C}$ such that $R_{M,C}$ is a subset
of M, and has domain C. A value restriction is a role

4 "maps-to" corresponds to the NIKL relation "MostGeneralRole".

5 Roles are seen as virtual objects in Loom, i.e., there are no structures
in the system which can be identified as roles.


restriction which restricts the range of $R_{M,C}$, while a
number restriction is a role restriction which places
bounds on the number of role fillers of $R_{M,C}$ that can be
associated with a single instance of C. A composition of
mappings $M_1, \ldots, M_k$ such that the domain of $M_1$ is
restricted to a particular concept (other than THING) is
called a role chain.

Loom distinguishes between "primitive" and
"defined" concepts (and relations). A concept is
primitive if no complete definition can be given for it
(see [Vilain 84, p. 549] or [Brachman and Schmolze 85]);
otherwise it is defined. Concepts and relations are organized
into a taxonomy based on a partial-ordering relation
called "specializes". A concept $C_1$ specializes a concept
$C_2$ if and only if membership in $C_1$ entails membership
in $C_2$, i.e., iff

$\models \forall x\,[C_1(x) \rightarrow C_2(x)]$.

An  instance  of  a  specializes  relation  between  two  concepts 
may  be  declared  explicitly  in  a  concept  definition,  or  it 
may  be  deduced  by  the  classifier. 

A value is an object which corresponds to a logical
constant in a knowledge base, and is typically left undefined
in a knowledge base. The numbers 1, 3, and 8.2,
and the sexes Male and Female are examples of values.
A concept which is defined by enumerating its instances
is called a set. Currently, all of the sets we define in the
TBox are sets of values. Number and Sex are examples
of sets. A (denumerable) set for which predecessor and
successor relations exist is termed an interval, e.g., Integer
and Days-of-the-Week are intervals.

To  classify  a  concept  means  to  link  it  into  the 
specialization  lattice  so  that  (i)  it  is  below  all  concepts 
which  it  specializes,  and  (ii)  it  is  above  all  concepts  which 
specialize  it.  The  most  specific  generalization  (MSG)  of 
a  concept  is  the  set  of  those  concepts  which  are/would 
become  its  direct  ancestors  (parents)  if  it  were  classified. 


To recognize an ABox object/instance x means to
compute the set of concepts $\{C_i\}$ such that for each $C_i$, x
is an instance of $C_i$, and x is not an instance of any descendant
of $C_i$. The set $\{C_i\}$ is referred to as the MSG of
x. In an informal discussion we may use the term
"classification" to refer to either the classification or the
recognition process.
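
The definition of the MSG of an instance can be illustrated with a short sketch. It is illustrative only: the taxonomy table, the concept names, and the function names below are our own assumptions and are not part of Loom.

```python
# Hypothetical taxonomy: each concept maps to its direct parents.
PARENTS = {
    "Thing": set(),
    "Animal": {"Thing"},
    "Elephant": {"Animal"},
    "White-Thing": {"Thing"},
}

def ancestors(concept, parents=PARENTS):
    """Return all proper ancestors of a concept in the taxonomy."""
    seen = set()
    stack = list(parents[concept])
    while stack:
        c = stack.pop()
        if c not in seen:
            seen.add(c)
            stack.extend(parents[c])
    return seen

def msg_of_instance(concepts_satisfied, parents=PARENTS):
    """Keep only the satisfied concepts that have no satisfied descendant."""
    satisfied = set(concepts_satisfied)
    covered = set()
    for c in satisfied:
        covered |= ancestors(c, parents)
    return satisfied - covered
```

For an object known to satisfy Thing, Animal, Elephant, and White-Thing, the sketch keeps only Elephant and White-Thing, since the other two concepts each have a satisfied descendant.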

4.  The  TBox 

In this section we present the syntax and semantics
of TBox definitions for (unary) concepts, (mapping) relations,
sets and intervals. Occasionally within this discussion
we will pause to point out some of the deductions
which the Loom classifier will (or will not) be able to
make. These comments are intended to foster an appreciation
for what kinds of inference one can expect
from a classifier. Next comes a brief discussion outlining
our reasons for prohibiting cyclically-defined concepts,
and we conclude with a presentation of three additional
restrictions which Loom imposes on TBox definitions.

4.1.  Defconcept  and  Defrelation 

A  formal  semantics  for  the  term-forming  operations 
defconcept  and  defrelation  appears  as  Appendix  B.  The 
simple  definitional  constructs  listed  in  the  figure  can  be 
combined  within  a  concept  or  relation  definition  to  form 
compound  definitions.  The  semantics  for  such  a  com¬ 
pound  definition  are  defined  as  the  logical  conjunction  of 
the  individual  lambda  definitions. 

For example, referring to the Engines and Cars KB
in Figure A-1, suppose we declare a new concept

(defconcept (:specializes Engine)
  (:restriction cylinders (:min 4) (:max 6))
  (:restriction fuel (:vr Gasoline))).

This concept means "an engine fueled by gasoline which
has between 4 and 6 cylinders." The TBox classifier will
discover that this concept specializes the concept labeled
Internal-Combustion-Engine.

The Familial Relations KB in Figure A-4 illustrates
how defrelation constraints can be combined to form
terms for the relations parent, father, grandfather, etc.
The classifier will determine, among other things, that
grandfather specializes grandparent and that parent and
grandparent specialize ancestors. A few short-hand notations
are provided in addition to the operators illustrated
in Figure A-4. The following pairs of forms are equivalent:

The forms

(:restriction M (:number k)) and
(:restriction M (:min k) (:max k)),

the forms

(:restriction (:vrdiff M C) ...) and
(:restriction
  (defrelation (:specializes M) (:range C)) ...),

the forms

(:restriction (:closure-of M) ...) and
(:restriction (defrelation (:closure-of M)) ...).

Loom's constraint clause extends the CNIKL construct
referred to as a "role-constraint" or "role-value-map"
by (1) allowing for other operators than just set-equality
and set-containment, and (2) allowing a value to
take the place of a role-chain. The argument "CP" in
the clause

(:constraint CP (...) (...))

must name a relation which falls in the sublattice rooted
at the relation Compute-Relation. Figure A-2 illustrates
some compute relations. The operators for computing
set-equality, set-inequality, and set-containment are other
examples of compute relations.

Again  referring  to  the  Engines  KB,  let  us  declare 
two  new  concepts: 

(defconcept Big-Engine
  (:constraint greater-than (horse-power) 120))
(defconcept Very-Big-Engine
  (:constraint greater-than (horse-power) 200))

We plan to upgrade the Loom classifier so that it will be
able to deduce that Very-Big-Engine specializes
Big-Engine. The analysis will necessitate recognizing the
truth of (greater-than 200 120), and will involve
reasoning about the transitivity of the greater-than relation.
During a 1986 NIKL users workshop [Moore 86],
Ron Brachman discussed the possibility of extending a
NIKL-like system to include a couple of new "boxes" in
addition to the traditional TBox and ABox. One of those
boxes he termed a "Mathematics Box", which would be a
specialized reasoner with the ability to derive mathematical
inferences in conjunction with the TBox reasoner.
The numerical reasoning facility just hinted at represents
an embryonic step in the direction of developing a
full-fledged mathematics box.
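
The kind of inference anticipated here can be sketched as follows. This is not Loom's implementation: modeling a constraint as a (relation, role, bound) triple, the class name `Constraint`, and the function name are all our own assumptions, and only greater-than is handled.

```python
from typing import NamedTuple

class Constraint(NamedTuple):
    relation: str   # only "greater-than" is handled in this sketch
    role: str       # e.g. "horse-power"
    bound: float

def constraint_specializes(c1: Constraint, c2: Constraint) -> bool:
    """True if every role filler satisfying c1 must also satisfy c2.

    For greater-than on the same role, x > b1 entails x > b2 whenever
    b1 >= b2 (by transitivity of greater-than).
    """
    if c1.relation != "greater-than" or c2.relation != "greater-than":
        return False  # outside the scope of this sketch: nothing is deduced
    return c1.role == c2.role and c1.bound >= c2.bound

big = Constraint("greater-than", "horse-power", 120)
very_big = Constraint("greater-than", "horse-power", 200)
```

Returning False for unhandled relations means "not deduced" rather than "refuted", so the sketch stays sound while remaining incomplete.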

We will conclude this section with an example containing
definitions for which Loom cannot deduce the implied
subsumption relations. Referring to the Familial
Relations KB again, consider the following definitions of
a concept named "Only-Child":

(defconcept Only-Child-1
  (:constraint equals self (parent child)))
(defconcept Only-Child-2
  (:restriction siblings (:max 0))).

The current Loom classifier cannot deduce that the concepts
Only-Child-1 and Only-Child-2 are equivalent.
The NIKL classifier is similarly unable to deduce this
equivalence relation (when applied to CNIKL analogues
of the above definitions). Our current development
philosophy is that we are committed to developing a system
which makes inferences which are sound, but not
necessarily complete. One of the philosophical goals of
the Loom system is to investigate empirically where the
boundaries should be on the expressive power of a TBox.
Once those bounds have been more-or-less established, it
may be appropriate to revive the goal of developing a
reasoner which is as complete as we can make it.

4.2.  Defset  and  Definterval 

This section describes the operators defset and
definterval, which can be employed to define sets and
intervals, and also to define concepts corresponding to
the values enumerated in those sets/intervals. Our examples
will reference the Sets and Intervals KB in Figure
A-3.

In many cases, there is a tight coupling between
values in a set or interval which represent "qualities"
(e.g., the sex Male or the color Red) and concepts such as
Male-Animal or Red-Thing which are defined by having
one of their attributes restricted to the corresponding
value. Definitions for Male-Animal and Red-Thing might
be

(defconcept Male-Animal (:specializes Animal)
  (:restriction sex (:vr Male)))
(defconcept Red-Thing
  (:specializes Monochrome-Thing)
  (:restriction color (:vr Red))).

Thus, we have

$\forall x\,[(\text{Animal}(x) \wedge \text{sex}(x, \text{Male})) \leftrightarrow \text{Male-Animal}(x)]$.
$\forall x\,[(\text{Monochrome-Thing}(x) \wedge \text{color}(x, \text{Red})) \leftrightarrow \text{Red-Thing}(x)]$.

The Loom syntax for sets and intervals includes an optional
"partitions" clause which produces the set of
definitions needed to characterize this behavior.

The declaration

(defset Sex (:values Male Female))

defines a set Sex and the values Male and Female. To introduce
the concepts Male-Animal and Female-Animal, we
can augment our definition with the clause
(:partitions Animal) (Figure A-3 illustrates the complete
definition). This larger declaration implicitly
declares the following expressions:

(defrelation Sex :primitive
  (:axioms (:domain Animal) (:range Sex)))
(defconcept Male-Animal (:specializes Animal)
  (:restriction Sex Male))
(defconcept Female-Animal (:specializes Animal)
  (:restriction Sex Female))

In addition, the declaration for the concept Animal is augmented
by a clause which indicates that Male-Animal and
Female-Animal form a disjoint covering of Animal.
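
The expansion performed by a "partitions" clause can be sketched as a simple text transformation. The function `expand_partitions` is hypothetical (it is not a Loom operator); it merely reproduces the implied declarations listed above and omits the disjoint-covering clause added to the partitioned concept.

```python
def expand_partitions(set_name, values, concept):
    """Return the declarations implied by a defset with a partitions clause.

    set_name: name of the set, reused as the name of the implied relation
    values:   the enumerated values of the set
    concept:  the concept being partitioned
    """
    decls = [
        f"(defrelation {set_name} :primitive "
        f"(:axioms (:domain {concept}) (:range {set_name})))"
    ]
    for v in values:
        decls.append(
            f"(defconcept {v}-{concept} (:specializes {concept}) "
            f"(:restriction {set_name} {v}))"
        )
    return decls

implied = expand_partitions("Sex", ["Male", "Female"], "Animal")
```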

We next turn our attention to the interval
Naval-Rank defined in Figure A-3. The declaration of
Naval-Rank implies the definition of a relation
Naval-Rank and also implies the declaration of the concepts
Seaman-Recruit, Seaman-Apprentice, ..., Admiral.6
The implied declaration for Admiral is

(defconcept Admiral (:specializes Naval-Person)
  (:restriction Naval-Rank Admiral)).7

Because Naval-Rank is specified as an interval,
rather than as a set, the relations "successor" and
"predecessor" are defined for its instances. Their definition
corresponds to the order of values in the "values"
clause. For example, (successor Commander Captain) is
true. The successor and predecessor relations may appear
within the role chains of a constraint clause. A
square-bracket notation can be employed to define a
(contiguous) subset of an interval. This is illustrated in
the definition of the set Naval-Officer-Rank, and in the
definitions below that for Natural-Number,
Positive-Integer, and Non-Negative-Integer. The
semantics of subsumption for intervals is the same as
that for sets. For example, the interval defined by

(definterval (:specializes Integer)
  (:values 375))

specializes the interval defined as

(definterval (:specializes Integer)
  (:values [2..9])).
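
A minimal sketch of these interval operations follows. The class `Interval`, its method names, and the ordering chosen for the days of the week are assumptions made for illustration; they do not describe Loom's internal structures.

```python
class Interval:
    """An ordered, enumerated set of values."""

    def __init__(self, values):
        self.values = list(values)

    def successor(self, v):
        """Next value in declaration order, or None at the upper end."""
        i = self.values.index(v)
        return self.values[i + 1] if i + 1 < len(self.values) else None

    def predecessor(self, v):
        """Previous value in declaration order, or None at the lower end."""
        i = self.values.index(v)
        return self.values[i - 1] if i > 0 else None

    def sub(self, low, high):
        """Contiguous sub-interval [low..high], both ends inclusive."""
        i, j = self.values.index(low), self.values.index(high)
        return Interval(self.values[i:j + 1])

    def specializes(self, other):
        """Subsumption as for sets: containment of the enumerated values."""
        return set(self.values) <= set(other.values)

days = Interval(["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"])
weekdays = days.sub("Mon", "Fri")
```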

4.3.  How  to  Avoid  Cycles 

A concept (or relation) definition depends on
another definition if it references the other concept by
name within its definition. If these depends-on links
form a cycle, then we say that the definitions involved
are cyclic. The designers of the NIKL system expressly
permitted cyclic definitions. However, the semantics associated
with cyclic CNIKL definitions was never fully
worked out, and the behavior of the NIKL classifier when
it encountered cycles was far from satisfactory. Loom
has taken an opposite position: cyclic definitions are
illegal in Loom.
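
One way to detect cyclic definitions is a depth-first search over the depends-on links. The sketch below is our own illustration, not the check used by Loom; the function name and the dictionary encoding of the links are assumptions.

```python
def find_cycle(depends_on):
    """Return one cycle of definition names as a list, or None if acyclic.

    depends_on maps each definition name to the names it references.
    """
    WHITE, GREY, BLACK = 0, 1, 2   # unvisited, on current path, finished
    color = {name: WHITE for name in depends_on}
    path = []

    def visit(name):
        color[name] = GREY
        path.append(name)
        for ref in depends_on.get(name, ()):
            state = color.get(ref, WHITE)
            if state == GREY:                 # back edge: a cycle is closed
                return path[path.index(ref):]
            if state == WHITE:
                found = visit(ref)
                if found:
                    return found
        color[name] = BLACK
        path.pop()
        return None

    for name in list(depends_on):
        if color[name] == WHITE:
            found = visit(name)
            if found:
                return found
    return None
```

A definition that names itself, such as a concept Human whose value restriction mentions Human, is reported as a cycle of length one.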

6 In the declaration of "Naval-Rank", the clause "(:suffix Nil)"
prevented the suffix "-Naval-Person" from being appended to each
new concept.

7 Observe that the concepts "admiral as person" and "admiral as
naval-rank" have the same name. Loom will automatically add suffixes
"-1" and "-2" to distinguish between them.

A primary motivation for allowing cycles was to
avoid placing a restriction on what concepts could appear
within a value restriction clause. Consider the following
definition of Human:

(defconcept Human :primitive (:specializes Mammal)
  (:restriction parents (:vr Human)))

The value restriction (:vr Human) allows the system to
infer "If an individual is Human, then so are its parents,
and their parents, and so on." Because that value
restriction is self-referential (defining a cycle of length
one), it is not permitted in Loom. However, Loom does
allow an equivalent restriction to be expressed as an
axiom in the UBox:

(defconcept Human :primitive (:specializes Mammal)
  (:axioms
    (:restriction parents (:vr Human))))

Thus, we retain in Loom the ability to make statements
such as, "the parents of humans are also human"; we
just don't allow them to be included as a part of the
(terminological) definition of a concept.

5.  The  UBox 

The  knowledge  which  we  place  in  the  Universal  box 
augments  individual  TBox  definitions  with  what  we  call 
universal  or  contingent  knowledge.  The  expressive  power 
of  the  Loom  language  increases  significantly  when  the 
definitional  language  is  extended  to  include  expressions  of 
universal  knowledge.  This  combined  language  admits  a 
correspondingly  larger  class  of  inferences. 

This section will first define the different types of
knowledge which we class as "universal". Next, we introduce
the notion of a "stable" classifier, which serves
to sharpen the definitional boundary between terminological
and universal knowledge. Finally, we will
present the representational model and classification algorithm
adopted by the Loom architecture to handle
universal knowledge.

5.1.  Types  of  Universal  Knowledge 

In anticipation of our later discussion on how Loom
represents universal knowledge, we will group our universal
knowledge into four categories. Referring to universal
knowledge that is attached to a concept "P", the
categories are:

1. Contingent restrictions and constraints --
these are restrictions or constraints which
necessarily apply to an instance "x" if P(x)
holds. These are often called "necessary
conditions";

2. Implications -- these are statements of the
form "P implies Q" (where Q is a concept
which does not subsume P). Often called
"sufficient conditions";

3. Equivalences -- these are statements of the
form "P if and only if Q". Often called
"necessary and sufficient conditions";

4. Other non-definitional knowledge about concepts
and relations. Currently this knowledge
consists of covering relations, disjointness relations,
marking concepts as "individual", and
domain and range constraints on mappings.

5.1.1.  Contingent  Restrictions  and  Constraints 

The "axioms" clause of a concept or relation definition
states universal knowledge which applies to that concept
or relation. The Engines and Cars KB of Figure A-1
illustrates several such clauses. The next few examples
will be drawn from that KB.

The clause

(:axioms (:res (:vrdiff has-component Engine)
  (:number 1)))

which appears in the definition of Car is an example of a
"contingent restriction". The meaning of the clause is

$\forall x\,[\text{Car}(x) \rightarrow \exists \text{ exactly one } y\,(\text{has-component}(x, y) \wedge \text{Engine}(y))]$.

This is sometimes referred to as a "necessary condition"
because it translates as "it is necessarily the case that a
car has exactly one engine." In general, the meaning of a
restriction (or constraint) appearing within an "axioms"
clause of a defconcept form defining a concept C is, "this
restriction (constraint) applies to all objects which are
instances of C".

5.1.2. Implications and Equivalence Relations

The clause (:axioms (:implies Car)) which appears
within the defconcept form which defines
Battery-Powered-Vehicle is an example of an
implication. Its meaning is

$\forall x\,[\text{Battery-Powered-Vehicle}(x) \rightarrow \text{Car}(x)]$.

This form of knowledge is sometimes called a "sufficient
condition" because it can be translated as "to determine
if x is a Car, it is sufficient to determine that x is a
Battery-Powered-Vehicle."

It is important to distinguish the difference in
semantics between an implication (an "implies" relation)
and a "specializes" relation. While the logical form associated
with each of them is identical, the semantics of
the specializes relation is significantly stronger. The
statement "B specializes A" says not only that (1) B
implies A, but also that (2) B's (TBox) definition includes
the definition of A, and (3) B inherits the (UBox)
properties of A.

A two-way implication established between a pair
of concepts defines an equivalence relation. More
generally, any cycle of implications through a set of concepts
establishes an equivalence relation between each
pair of concepts in that set. Suppose a set of concepts
$\{C_i\}$ have been defined such that they are pairwise-equivalent.
While the TBox sees the $C_i$ as distinct concepts,
the UBox view of this knowledge sees a single concept
$C_U$ which combines all of the knowledge declared in
each of the $C_i$ (this is described in more detail in section
5.3.). This means that universal knowledge (other than
the "implies" relations) can be distributed in any number
of ways among the $C_i$'s, and the semantics will always be
the same.
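
The grouping of concepts that lie on a common cycle of implications can be sketched as follows. The function name and the dictionary encoding of "implies" links are assumptions made for this illustration; the sketch says nothing about how Loom itself computes these classes.

```python
def equivalence_classes(implies):
    """Group concepts that lie on a common cycle of implications.

    implies maps a concept name to the concepts it directly implies.
    Two concepts fall in the same class when each is reachable from
    the other by following "implies" links.
    """
    names = set(implies) | {q for qs in implies.values() for q in qs}

    def reachable(start):
        seen, stack = set(), [start]
        while stack:
            for q in implies.get(stack.pop(), ()):
                if q not in seen:
                    seen.add(q)
                    stack.append(q)
        return seen

    reach = {n: reachable(n) for n in names}
    classes = set()
    for n in names:
        classes.add(frozenset({n} | {m for m in reach[n] if n in reach[m]}))
    return classes
```

With links A implies B, B implies C, C implies A, and D implies A, the concepts A, B, and C form one class, while D stays in a class of its own because nothing implies D.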

The preferred way to model a set of equivalent concepts
$\{C_i\}$ is to explicitly declare an additional concept C
which specializes each of the $C_i$, and which contains all of
the universal knowledge associated with the $C_i$, except
for a clause (:axioms (:implies C)) which appears in
each of the $C_i$ definitions. Our definition of the concepts
Diesel-Oil-Engine, Thing-With-Glow-Plugs,
Very-High-Compression-Engine, and Diesel-Engine in
Figure A-1 illustrates this type of modeling.

5.1.3.  Coverings  and  Disjointness  Classes 

A covering for a concept "A" is a set of concepts
whose union contains A. Loom syntax requires that the
concepts within such a covering specialize A, so that the
union of the covering concepts equals A. The meaning of
the clause (:axioms (:covering B C)) within a
defconcept for A is

$\forall x\,[A(x) \rightarrow (B(x) \vee C(x))]$.

Declarations of unary coverings (coverings containing a
single concept) are illegal in Loom because they are logically
equivalent to "implies" relations, and hence are
redundant.

A disjointness class is a set of concepts which are
declared to be mutually disjoint. A disjointness class is
always defined with respect to a concept which subsumes
the members of the class. The meaning of the clause

(:axioms (:disjoint B C))

within a defconcept for A is

$\forall x\,[B(x) \rightarrow \neg C(x)]$.

A disjoint-covering of a concept A enumerates a set
of concepts which partition A, i.e., it is interpreted as the
logical conjunction of a covering declaration and a
disjointness declaration.
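
These three declarations can be checked against finite extensions with a few set operations. The sketch below assumes that each concept is represented by a Python set holding its known instances; the function names are our own, and the check says nothing about how Loom reasons over such declarations.

```python
def is_covering(a, parts):
    """Every instance of A belongs to at least one covering concept."""
    return a <= set().union(*parts)

def is_disjoint(parts):
    """No instance belongs to two different concepts of the class."""
    parts = list(parts)
    return all(
        parts[i].isdisjoint(parts[j])
        for i in range(len(parts))
        for j in range(i + 1, len(parts))
    )

def is_disjoint_covering(a, parts):
    """Conjunction of a covering and a disjointness declaration."""
    return is_covering(a, parts) and is_disjoint(parts)

# Hypothetical extensions, used only to exercise the functions.
animal = {"a", "b", "c"}
male_animal = {"a"}
female_animal = {"b", "c"}
```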

The Numeric-Comparison KB in Figure A-2 illustrates
some declarations of coverings and disjoint-coverings.
The covering defined for the relation
numeric-comparison declares that the relations
greater-or-equal and less-or-equal cover
numeric-comparison. The disjoint-covering declaration
for greater-or-equal states both that greater-or-equal
is covered by greater-than and equal, and that the relations
greater-than and equal are disjoint. Loom
provides functions for asking questions about (declared or
derived) disjointness and covering relations, such as "Are
concepts A and B disjoint?", "Do concepts A, B, and C
cover concept D?", or "List all coverings for concept D".

Loom requires that the concepts or relations appearing
in a covering, disjointness class, or disjoint-covering
must all be primitive. The philosophical justification
for this restriction is that if one or more of the
members of the covering and/or disjointness class are not
primitive then either (i) the covering and/or disjointness
relations could have been logically inferred on the basis
of other knowledge or (ii) such relation(s) could not be
derived. In the former case, the covering
and/or disjointness declaration is redundant, and should be
dropped. In the latter case, there must have been something
left unstated about the non-primitive concepts,
which suggests that they are in fact primitive.

The implementors of the NIKL system encountered
a practical reason for requiring members of a disjointness
class to be primitive. That restriction prevented an
anomaly which arose in a situation in which so-called
"incoherent" concepts were being classified. The possibility
of a similar anomaly arising in conjunction with
covering declarations has not yet been explored.

We are considering omitting the disjoint clause altogether
from the Loom language, owing to the observation
that we have not yet encountered the use of a disjointness
declaration in a context where an obvious covering
relation did not also exist, i.e., where
disjoint-covering could not have been substituted. Our
syntactic requirement that a disjointness class be defined
relative to a particular concept anticipates this future
restriction.

5.1.4. Domain and Range Restrictions

"Domain" and "range" clauses which appear
within an "axioms" clause state necessary conditions
about a relation. The declaration

(defrelation M ...
  (:axioms (:domain A) (:range B)))

makes the universal statement

$\forall x\,\forall y\,[M(x,y) \rightarrow A(x) \wedge B(y)]$.

Knowledge  about  domain  and  range  constraints  is 
referenced  during  the  "model-building"  phase,  when  the 
initial  definitions  of  concepts  and  relations  are  being 
refined  and  checked  for  coherence.  In  this  context,  these 
constraints  function  as  "integrity  constraints". 
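
Read as integrity constraints, domain and range declarations can be checked against a finite set of asserted pairs. The sketch below assumes that the extensions of A and B are fully known Python sets, which is a simplification (it ignores the open-world reading of the ABox); the function name is hypothetical.

```python
def domain_range_violations(pairs, domain, range_):
    """Return the (x, y) pairs of a relation M that break
    M(x, y) -> A(x) and B(y), given finite extensions of A and B."""
    return [(x, y) for (x, y) in pairs if x not in domain or y not in range_]

# Hypothetical data: a color relation with domain Elephant, range Color.
elephants = {"Clyde"}
colors = {"Grey", "White"}
asserted = [("Clyde", "Grey"), ("Rock", "Grey"), ("Clyde", "Loud")]
violations = domain_range_violations(asserted, elephants, colors)
```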

5.1.5.  Individual  Concepts 

Marking a concept as "individual" means that its
extension has cardinality at most one. We have identified
some inferences that can be made on the basis of
individual markings on concepts, but none of these inferences
are particularly useful. Thus, this feature currently
serves only as a place-holder, awaiting a user who
will conceive of a use for it.

The presence of the "individual" marking is a part
of Loom's NIKL heritage. Because most applications of
NIKL operated without an ABox, individual concepts
served in lieu of real ABox instances.

5.2.  Stable  and  Non-Stable  Classifiers 

Consider the following scenario: A rather shady-looking
character produces from his capacious overcoat a
large black box, which he claims is a seventh-generation
classifier of terminological knowledge, guaranteed to
produce sound (although not necessarily complete) inferences
very quickly. We decide to test out his BBC
(black-box classifier). First we store into the BBC the
definitions of two concepts which we call A and B. We
then ask the BBC "Does B specialize A?", and it
responds (very quickly) "No". Next, we enter a few
more definitions into the BBC, and again ask the BBC
"Does B specialize A?" This time, its rapid rejoinder is
"Yes"! Should we buy his BBC (his price is very
reasonable)?


The answer is no. Let us define a stable classifier
(or recognizer) to be one which produces the same
answers to subsumption questions independently of additions
or subtractions to/from the knowledge base (here
we assume that no concept definitions are modified, and
that at no time does the knowledge base contain undefined
references). "Stability" is a highly-desirable feature
in a TBox, because it provides a certain guarantee
that when TBox knowledge is shared across several
knowledge bases (e.g., by several applications) it will
retain the same "meaning" in each of those contexts.
We propose that "stability" be considered a test which
serves to exclude some reasoners from being considered
TBox classifiers. ("Soundness" should be another TBox
requirement.) The Loom TBox is an example of a stable
classifier; our friend's BBC is not stable.

The Loom UBox classifier/recognizer is not stable!
Consider the Cars KB in Figure A-1. Suppose we make
the following assertions

(assert (Motor-Vehicle BPV) (2-Person-Vehicle BPV)
        (Battery-Powered-Engine E)
        (has-component BPV E))

Now we ask, "Is BPV an instance of 2-Person-Car?" The
UBox recognizer will make the following inferences

(Battery-Powered-Vehicle BPV)
(Car BPV)    because of the "implies" axiom
(2-Person-Car BPV)

and conclude "Yes". However, if we remove the definition
for Battery-Powered-Vehicle (or if it never existed)
and re-run the UBox recognizer, it will not conclude either
(Car BPV) or (2-Person-Car BPV).8 On the
other hand, if we run the Loom TBox recognizer on the
same knowledge base and assertions, it will fail in both
cases to recognize that BPV is a car (or a 2-person car).
This behavior occurs because the axiom
"Battery-Powered-Vehicle implies Car" is invisible to the


8 Note: This does not mean that it concludes "$\neg$(Car BPV)". It
merely fails to infer "(Car BPV)" -- the UBox classifier is not a
non-monotonic reasoner.


TBox. The stability of the TBox classifier derives from
the restrictions we place on what kinds of knowledge are
classed as terminological in the TBox, not from the particular
inference algorithm chosen -- we deliberately exclude
from the TBox classes of knowledge which introduce
non-stable behavior.

5.3.  Modeling  and  Classification  of  Universal 
Knowledge 

This section represents a long engineering note. We
first describe the internal model adopted by Loom to
represent universal knowledge, and then give some insight
into the workings of Loom's UBox classification algorithm.

In a Loom concept network, separate objects, which
we shall refer to as C_T and C_U, are defined to represent
the TBox and UBox knowledge associated with a single
concept C.9 C_T contains exactly the definitional
(terminological) component of C. C_U contains both the
definitional and the contingent knowledge associated
with C. Thus, by construction, C_U always specializes
C_T. An "implies" link connects C_T to C_U, and has the
meaning $\forall x\,[C_T(x) \rightarrow C_U(x)]$. In other
words, C_T implies C_U.

9 Browsers of Loom knowledge bases should be aware of the following:
Loom maintains separate name spaces for TBox objects and
UBox objects. In the TBox name space, only TBox objects are
visible, and C_T has the name "C". In the UBox, only UBox objects
are visible, and C_U has the name "C".

Within a UBox concept, contingent restrictions and
constraints are merged into a single definition, and are
classified according to that definition. Suppose, for example,
that we made the following declarations:

(defconcept C (:restriction R (:min 1))
  (:axioms (:restriction S (:min 1))))

(defconcept D (:restriction R (:min 1))
  (:restriction S (:min 1)))

The classifier cannot distinguish between the objects C_U
and D_U, and hence will merge these two concepts.

Implications are modeled as follows: Suppose we
declare

(defconcept A ...
  (:axioms (:implies B)))

Rather than placing, say, an "implies" link between A_U
and B_U, Loom captures the semantics of the implication
axiom by merging all of the knowledge in B_U into A_U (in
effect, "compiling out" the "implies" link). Equivalence
relations add nothing new to the model, since they just
consist of cycles of implication relations. If we declared
that "A implies B", and also that "B implies A", Loom
would merge B_U into A_U, and would also merge A_U into
B_U, making A_U and B_U identical. The classifier would
then merge these into a single UBox concept.
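
The "compiling out" of implications can be illustrated with a minimal sketch. It assumes, purely for illustration, that a UBox object is just a set of restriction strings; the function names compile_implications and merge_identical are hypothetical and are not part of Loom.

```python
def compile_implications(definitions, implies):
    """Merge the knowledge of each implied concept into the implying one.

    definitions: dict mapping a concept name to the set of restrictions
                 held by its UBox object (an assumed representation)
    implies:     list of (a, b) pairs meaning "a implies b"
    The loop repeats until nothing changes, so chains and cycles of
    implications are handled.
    """
    merged = {name: set(parts) for name, parts in definitions.items()}
    changed = True
    while changed:
        changed = False
        for a, b in implies:
            before = len(merged[a])
            merged[a] |= merged[b]
            if len(merged[a]) != before:
                changed = True
    return merged


def merge_identical(merged):
    """Group concept names whose merged definitions are indistinguishable."""
    groups = {}
    for name, parts in merged.items():
        groups.setdefault(frozenset(parts), []).append(name)
    return sorted(sorted(names) for names in groups.values())


defs = {"A": {"R min 1"}, "B": {"S min 1"}, "C": {"T min 1"}}
one_way = compile_implications(defs, [("A", "B")])
both_ways = compile_implications(defs, [("A", "B"), ("B", "A")])
```

With a one-way implication only A_U grows; with the cycle, A_U and B_U become identical and fall into the same group, which mirrors the merge into a single UBox concept.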

Loom's internal model of three of our original four
categories of universal knowledge can thus be accomplished
with the addition of only one new link, the
"implies" link.10 An important property of the model is
that, in all cases, the "implies" links connect more
general concepts to more specific ones. The Loom (and
NIKL) TBox classifiers operate by picking an initial set
of "starting points" (concepts) and then traversing down
"subC" links which connect each concept to those concepts
which directly specialize it. Loom's UBox classifier
traverses down both "subC" and "implies" links. Because
the "subC" and "implies" links form an acyclic
directed graph, termination of the UBox classifier is
guaranteed.
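
The traversal just described can be sketched as follows. This is an illustration under assumptions, not Loom's classifier: the link tables are plain dictionaries, the graph fragment is abbreviated from the Cars KB, and the "_T"/"_U" suffixes are a hypothetical naming of the TBox and UBox objects.

```python
def reachable_below(start_points, sub_c, implies):
    """Visit every concept reachable by following "subC" and "implies"
    links downward from the starting points.

    sub_c and implies map a concept to the concepts its links point at.
    Each concept is visited at most once; on an acyclic graph every path
    is finite, so the walk must terminate.
    """
    visited = []
    seen = set()
    stack = list(start_points)
    while stack:
        concept = stack.pop()
        if concept in seen:
            continue
        seen.add(concept)
        visited.append(concept)
        stack.extend(sub_c.get(concept, []))
        stack.extend(implies.get(concept, []))
    return visited


# Abbreviated, hypothetical fragment of the Cars network.
sub_c = {
    "Vehicle": ["Motor-Vehicle", "2-Person-Vehicle"],
    "Motor-Vehicle": ["Battery-Powered-Vehicle_T"],
    "2-Person-Vehicle": ["2-Person-Car"],
}
implies = {"Battery-Powered-Vehicle_T": ["Battery-Powered-Vehicle_U"]}

ubox_walk = reachable_below(["Vehicle"], sub_c, implies)
tbox_walk = reachable_below(["Vehicle"], sub_c, {})
```

Passing an empty "implies" table gives the TBox-style walk, which never reaches the UBox object.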

10 The fourth category "other" is handled by special-purpose data
structures and algorithms which are outside of the scope of this
discussion.

During the process of classifying/recognizing an object
X in the UBox, the traversal of an "implies" link can
cause knowledge to be acquired about X which is not entailed
by its definition. This is the source of the
"non-stability" in the UBox classifier. Recall the example in
section 5.2 which traced the recognition of the object
"BPV". One of the algorithm's starting points is the
concept 2-Person-Vehicle. If we visit its child
2-Person-Car and make the test (2-Person-Car X) before
having traversed the "implies" link between
Battery-Powered-Vehicle_T and
Battery-Powered-Vehicle_U, we would receive a negative
answer. Traversing that link causes us to acquire the
knowledge (Car BPV). After this point, the test
(2-Person-Car X) returns in the affirmative. Hence, the
first test to see if X was a 2-person car represented
wasted effort.

One practical consequence of non-stability is that
the ordering of subsumption tests is more critical for
UBox classification than for TBox classification. Furthermore,
it is not always the case that careful ordering of
subsumption tests can avoid the necessity to repeat some
subsumption tests (unless you have an "oracle" at your
disposal). Theoretically, UBox classification could be
significantly slower than TBox classification. We have not
yet performed empirical tests which compare the relative
performance of the two algorithms, but we expect that
we will be able to achieve reasonable performance from
the UBox.
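
The effect of test ordering can be seen in a small forward-chaining sketch of the BPV example. Everything here is an assumption made for illustration: concepts are reduced to sets of required facts, the fact name "has-battery-powered-engine" stands in for the has-component restriction, and only one step of implication is followed per recognized concept.

```python
def recognize(asserted, definitions, implies):
    """Forward-chain to a fixed point.

    Returns (set of known facts and types, number of subsumption tests run).
    definitions: concept -> set of facts sufficient to recognize it
    implies:     concept -> set of concepts acquired once it is recognized
    """
    known = set(asserted)
    tests = 0
    changed = True
    while changed:
        changed = False
        for concept, conditions in definitions.items():
            if concept in known:
                continue
            tests += 1                      # one subsumption test
            if conditions <= known:
                known.add(concept)
                known |= set(implies.get(concept, ()))
                changed = True
    return known, tests


facts = {"Motor-Vehicle", "2-Person-Vehicle", "has-battery-powered-engine"}
implies = {"Battery-Powered-Vehicle": {"Car"}}
bpv_def = {"Motor-Vehicle", "has-battery-powered-engine"}
car2_def = {"Car", "2-Person-Vehicle"}

# Python dicts preserve insertion order, so these differ only in test order.
unlucky = {"2-Person-Car": car2_def, "Battery-Powered-Vehicle": bpv_def}
lucky = {"Battery-Powered-Vehicle": bpv_def, "2-Person-Car": car2_def}
no_bpv = {"2-Person-Car": car2_def}

known_unlucky, tests_unlucky = recognize(facts, unlucky, implies)
known_lucky, tests_lucky = recognize(facts, lucky, implies)
known_no_bpv, _ = recognize(facts, no_bpv, implies)
```

Testing 2-Person-Car first costs three tests, one of them wasted; testing Battery-Powered-Vehicle first costs two. Dropping the Battery-Powered-Vehicle definition leaves both Car and 2-Person-Car underived, which is the non-stability discussed in section 5.2.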

6.  Default  Knowledge 

Loom establishes a separate "box" for representing
"default knowledge": knowledge representing statements
that are "typically" true, but which are not
axiomatic. Conceptually, this default knowledge consists
of rules of the form "if nothing has been asserted or
deduced which contradicts X, then assume X".

We will first discuss why the Loom architecture includes
a Default Box. Then we will examine the semantics
of the default value and closed-world-assumption
constructs. Finally, we will preview what the operation
of a non-monotonic classifier might look like.
6.1. The Case for a Default Box

We reject the idea of combining assertional and
default knowledge into a single "non-monotonic ABox".
Such a strategy would contradict a philosophical goal of
the Loom architecture: We wish to reserve the ABox for
statements about individuals, and to extend the
representational power of the non-ABox portion of the
system so that all statements about "classes of
individuals" can be represented somewhere other
than in the ABox. The nature of default knowledge is
that it generally makes statements about classes of
individuals. Thus, we must consider the implications
of developing yet another box.

The prerequisites for defining a new "box" in the
Loom knowledge representation framework are that (i)
we can identify a significant body of knowledge which
would be assigned to that box, and (ii) a specialized
reasoning facility exists to process the inferences
associated with this knowledge. The Loom system does not
yet meet these requirements, because it is able to respond
to only two very specialized forms of default knowledge:
it includes a limited treatment of default values, and it
recognizes certain closed-world assumptions. On the
other hand, we already have some idea of what a (much
more general) non-monotonic classifier would look like.
Its behavior is sketched below, in section 6.3. Therefore,
we anticipate that both prerequisites will be met in a
future version of Loom.

6.2. Default Values and Closed-World Assumptions

A "default value" is a value which is assigned to fill
a role/slot for some individual in the absence of any
explicitly asserted (or derived) knowledge about that role
filler. For example, in our Engines and Cars KB, the
form

(:defaults (:restriction type-of-fuel
                         (:vr Gasoline)))

in the defconcept declaration for
Internal-Combustion-Engine declares that Gasoline is the
default value for the role type-of-fuel. If for some
constant "x" we have asserted
(Internal-Combustion-Engine x), and we have made no
assertions of the form (type-of-fuel x f), then the
default assumption is (type-of-fuel x Gasoline).
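
A minimal sketch of this lookup rule follows. It is not Loom's mechanism: the triple representation and the function name fillers are assumptions for illustration, and the sketch considers only explicitly asserted fillers, ignoring derived ones.

```python
def fillers(role, individual, assertions, types, defaults):
    """Return (values, assumed_flag) for a role of an individual.

    assertions: set of (role, individual, value) triples
    types:      individual -> set of concept names asserted of it
    defaults:   (concept, role) -> default value
    Explicit fillers always win; a default is used only when none exist.
    """
    explicit = {v for (r, i, v) in assertions if r == role and i == individual}
    if explicit:
        return explicit, False
    assumed = {
        defaults[(concept, role)]
        for concept in types.get(individual, ())
        if (concept, role) in defaults
    }
    return assumed, True


types = {
    "x": {"Internal-Combustion-Engine"},
    "y": {"Internal-Combustion-Engine"},
}
assertions = {("type-of-fuel", "y", "Diesel-Oil")}
defaults = {("Internal-Combustion-Engine", "type-of-fuel"): "Gasoline"}

fuel_x = fillers("type-of-fuel", "x", assertions, types, defaults)
fuel_y = fillers("type-of-fuel", "y", assertions, types, defaults)
```

Here "x" receives the default Gasoline, flagged as an assumption, while "y" keeps its asserted Diesel-Oil.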

The act of assigning a default value can trigger a
re-classification of an ABox object. For example, after
making the assertion (assert (Elephant E1)), the process
of classifying E1 as an elephant could trigger a default
assertion (color E1 Grey), which might then cause E1 to be
re-classified as a grey-elephant (if such a concept existed).
We have yet to investigate whether default values may
trigger cycles of reclassifications, and, if so, how the
semantics of assigning default values should be restricted
to prevent such cycles.

The Loom representation of closed-world assumptions
is another example where we can elicit useful
default behavior in the absence of a general-purpose
non-monotonic classifier. Each ABox knowledge base is
assumed to have either a "closed-world" or an
"open-world" interpretation. "Open-world" means that in
addition to the assertions about an individual that are
explicitly stated in the knowledge base, there may be other
relevant assertions which have been left unstated. For
example, consider the Engines and Cars KB once again.
Suppose we make the assertions

(assert (Internal-Combustion-Engine e)
        (Cylinder c1) (Cylinder c2)
        (Cylinder c3) (Cylinder c4)
        (cylinder e c1) (cylinder e c2)
        (cylinder e c3) (cylinder e c4))

Can we deduce (4-Cylinder-Engine e)? The answer is
no if we adopt an open-world assumption, because the
possibility exists that there are 4 (or 12, or whatever)
more cylinders which are also components of the engine
"e". On the other hand, adopting a closed-world
assumption would allow us to conclude that the four
cylinders which are components of "e" are the only ones
that exist, in which case the inference
(4-Cylinder-Engine e) is valid.

Loom allows one to declare selective "regions" of
closed-world semantics within an open-world knowledge
base. The declaration

(defrelation M ...
  (:axioms (:domain D))
  (:defaults :closed-world-assumption))

has the following interpretation: "If "D(x)" has been
asserted (or can be deduced) for some x, then for all y,
"M(x, y)" is true only if it has been explicitly asserted, or
can be derived." The defrelation declaration for the
relation cylinder in the Engines and Cars KB includes
such a closed-world assumption. This assumption allows
the classifier to count instances of the cylinder relation
when attempting to recognize an object as an instance of
the concept 4-Cylinder-Engine.
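
The difference between the two interpretations can be sketched with a three-valued answer. This is an illustrative assumption, not Loom's implementation: the function name is hypothetical, None stands for "cannot be decided", and distinct constant names are assumed to denote distinct cylinders.

```python
def has_exactly_n_cylinders(engine, n, assertions, closed_world):
    """Answer True, False, or None (unknown) for "engine has exactly n cylinders".

    assertions: set of (relation, subject, object) triples
    closed_world: whether the cylinder relation is closed for this engine
    """
    known = {c for (rel, e, c) in assertions if rel == "cylinder" and e == engine}
    if closed_world:
        # Only the asserted fillers exist, so counting them is conclusive.
        return len(known) == n
    if len(known) > n:
        # Already too many distinct cylinders; more can only make it worse.
        return False
    # Open world: further, unstated cylinders may exist.
    return None


assertions = {
    ("cylinder", "e", "c1"), ("cylinder", "e", "c2"),
    ("cylinder", "e", "c3"), ("cylinder", "e", "c4"),
}

closed_answer = has_exactly_n_cylinders("e", 4, assertions, closed_world=True)
open_answer = has_exactly_n_cylinders("e", 4, assertions, closed_world=False)
```

Under the closed-world reading the count of four is conclusive; under the open-world reading the same assertions leave the question undecided.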

6.3. Preview of a Non-Monotonic Classifier

A non-monotonic classifier has not yet been
developed for the Loom architecture. We provide here a
preview of what its behavior will be like if and when it is
constructed, with the intention of stimulating the
demand for such a reasoner. Our example provides an
illustration of how a classic problem in non-monotonic
reasoning can be modeled by the Loom language.

In the process of classifying/recognizing an object
"x", a non-monotonic classifier will reference both
explicitly declared knowledge and default knowledge about
"x", and hence may deduce classifications which are
based on default assumptions. As is the case with UBox
classification, the classifier may acquire additional
information about "x" in the midst of the classification
process. The possibility arises that the "acquired"
knowledge will contradict one (or more) of the default
assumptions. In this case, the classifier must retract any
classifications it has already made which were based on
these invalid assumptions.


Consider the Birds KB in Figure A-5. Suppose we
have made the assertion

(assert (Penguin Tweety))

The classifier may first deduce (Bird Tweety), then
pick up the attached default implication and assume
(Flying-Animal Tweety), and then deduce
(Flying-Bird Tweety). Next, it may deduce
(Non-Flying-Animal Tweety) from the definition of
Penguin, and then discover that Flying-Animal and
Non-Flying-Animal are disjoint. At this point, it must
retract the earlier deductions (Flying-Animal Tweety)
and (Flying-Bird Tweety).
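
The Tweety trace can be reproduced by a small sketch that records, for each belief, the default assumptions it rests on. This is only one possible design and is an assumption throughout: no such classifier exists in Loom, and the data structures and function names below are hypothetical.

```python
def _add(beliefs, concept, support):
    """Record a belief; a strict derivation replaces a default-based one."""
    old = beliefs.get(concept)
    if old is None or (old and not support):
        beliefs[concept] = frozenset(support)
        return True
    return False


def classify_with_defaults(asserted, specializes, defaults, definitions, disjoint):
    """Return (surviving beliefs, sorted list of retracted concepts).

    Each belief maps to the set of default assumptions supporting it
    (an empty set means it was derived strictly).
    """
    beliefs = {c: frozenset() for c in asserted}
    changed = True
    while changed:
        changed = False
        for c in list(beliefs):
            support = beliefs[c]
            for parent in specializes.get(c, ()):
                changed |= _add(beliefs, parent, support)
            for assumed in defaults.get(c, ()):
                changed |= _add(beliefs, assumed, support | {assumed})
        for concept, parts in definitions.items():
            if all(p in beliefs for p in parts):
                support = frozenset().union(*(beliefs[p] for p in parts))
                changed |= _add(beliefs, concept, support)
    # An assumption is invalid if it is disjoint with a strict belief.
    strict = {c for c, s in beliefs.items() if not s}
    bad = {a for s in beliefs.values() for a in s
           if any(frozenset({a, t}) in disjoint for t in strict)}
    retracted = sorted(c for c, s in beliefs.items() if s & bad)
    kept = {c: s for c, s in beliefs.items() if not (s & bad)}
    return kept, retracted


specializes = {
    "Penguin": {"Bird", "Non-Flying-Animal"},
    "Bird": {"Animal"},
    "Flying-Animal": {"Animal"},
    "Non-Flying-Animal": {"Animal"},
    "Flying-Bird": {"Bird", "Flying-Animal"},
}
defaults = {"Bird": {"Flying-Animal"}}
definitions = {"Flying-Bird": ("Bird", "Flying-Animal")}
disjoint = {frozenset({"Flying-Animal", "Non-Flying-Animal"})}

tweety, tweety_retracted = classify_with_defaults(
    {"Penguin"}, specializes, defaults, definitions, disjoint)
robin, robin_retracted = classify_with_defaults(
    {"Bird"}, specializes, defaults, definitions, disjoint)
```

For the penguin, both Flying-Animal and Flying-Bird are retracted because they rest on the contradicted assumption; for an ordinary bird the default stands.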

7.  Conclusion 

The Loom language introduces new expressivity and
some new and powerful forms of inference into the
KL-ONE paradigm for knowledge representation. The most
significant achievement is the formulation of the UBox,
which allows universal knowledge to be defined and
reasoned about independently of the terminological
knowledge. The UBox solves a long-standing problem of
how to represent necessary and sufficient conditions, and
provides a way for a user to introduce cyclic references
into a knowledge base without derailing the classifier.

Looking towards the future, we have described the
behavior of a Default Box, indicating how a classifier
might be extended to perform non-monotonic classifications.
Collectively, our results suggest that we have
taken another step in an ongoing evolution of knowledge
representation systems, wherein increasing numbers of
specialized forms of reasoning can be organized within a
principled knowledge representation framework.
A. Knowledge Bases


Engines  and  Cars  Knowledge  Base 

(defrelation has-component :primitive (:inverse-of component-of))
(defrelation component-of :primitive)
(defconcept Horse-Power :primitive)
(defrelation horse-power (:range Horse-Power))
(defconcept Fuel :primitive)
(defrelation type-of-fuel (:range Fuel))
(defconcept Gasoline :primitive (:specializes Fuel))
(defconcept Diesel-Oil :primitive (:specializes Fuel))

;;; Engines

(defconcept Engine :primitive
  (:axioms (:restriction type-of-fuel (:number 1))
           (:restriction horse-power (:number 1))))
(defconcept Cylinder :primitive)
(defrelation cylinders (:specializes has-component) (:range Cylinder)
  (:defaults :closed-world-assumption))
(defconcept Internal-Combustion-Engine (:specializes Engine)
  (:restriction cylinders (:min 1))
  (:defaults
    (:restriction type-of-fuel (:vr Gasoline))))
(defconcept 4-Cylinder-Engine (:specializes Engine)
  (:restriction cylinders (:number 4)))

;;; Diesel-Engines

(defconcept Glow-Plug :primitive)
(defrelation compression-ratio :primitive
  (:axioms (:domain Internal-Combustion-Engine) (:range Integer)))
(defconcept Diesel-Oil-Engine (:specializes Engine)
  (:restriction type-of-fuel (:vr Diesel-Oil))
  (:axioms
    (:implies Diesel-Engine)))
(defconcept Thing-With-Glow-Plugs
  (:restriction (:vrdiff has-component Glow-Plug) (:min 1))
  (:axioms
    (:implies Diesel-Engine)))
(defconcept Very-High-Compression-Engine
  (:constraint greater-than (compression-ratio) 15)
  (:axioms
    (:implies Diesel-Engine)))
(defconcept Diesel-Engine :primitive
  (:specializes Internal-Combustion-Engine Diesel-Oil-Engine
                Thing-With-Glow-Plugs Very-High-Compression-Engine))
(defconcept Battery-Powered-Engine :primitive (:specializes Engine))

;;; Cars

(defconcept Vehicle :primitive)
(defconcept Motor-Vehicle (:specializes Vehicle)
  (:restriction (:vrdiff has-component Engine) (:number 1)))
(defconcept Battery-Powered-Vehicle (:specializes Motor-Vehicle)
  (:restriction (:vrdiff has-component Engine) (:vr Battery-Powered-Engine))
  (:axioms (:implies Car)))
(defrelation occupants :primitive (:range Human))
(defconcept 2-Person-Vehicle (:specializes Vehicle)
  (:restriction occupants (:max 2)))
(defconcept Car :primitive (:specializes Vehicle)
  (:axioms
    (:implies Motor-Vehicle)))
(defconcept 2-Person-Car (:specializes Car 2-Person-Vehicle))


Figure A-1: Engines and Cars
Numeric Comparison Knowledge Base


;;; Numeric Comparison Predicates

(defrelation numeric-comparison :primitive (:specializes Compute-Relation)
  (:axioms
    (:domain Real-Number) (:range Real-Number)
    (:covering greater-or-equal less-or-equal)))
(defrelation greater-than :primitive (:specializes greater-or-equal not-equal)
  (:annotation
    (:membership-test (lambda (domain range) (> domain range)))))
(defrelation less-than :primitive (:specializes less-or-equal not-equal)
  (:annotation
    (:membership-test (lambda (domain range) (< domain range)))))
(defrelation equal :primitive (:specializes greater-or-equal less-or-equal)
  (:annotation
    (:membership-test (lambda (domain range) (eql domain range)))))
(defrelation not-equal :primitive (:specializes numeric-comparison)
  (:axioms
    (:disjoint-covering greater-than less-than)))
(defrelation greater-or-equal :primitive (:specializes numeric-comparison)
  (:axioms
    (:disjoint-covering equal greater-than)))
(defrelation less-or-equal :primitive (:specializes numeric-comparison)
  (:axioms
    (:disjoint-covering equal less-than)))

Figure A-2: Numeric Comparison

Sets  and  Intervals  Knowledge  Base 

;;; Sex

(defconcept Animal :primitive)
(defset Sex (:values Male Female) (:partitions Animal))

;;; Navy Rankings

(defconcept Navy-Person :primitive)
(defconcept Military-Rank :primitive)
(defrelation Rank (:range Military-Rank))
(defrelation Naval-Rank :primitive (:specializes Rank)
  (:axioms (:domain Navy-Person) (:range Naval-Rank)))
(definterval Naval-Rank :primitive (:specializes Military-Rank)
  (:values Seaman-Recruit Seaman-Apprentice Seaman Petty-Officer-Third-Class
           Petty-Officer-Second-Class Petty-Officer-First-Class Chief-Petty-Officer
           Senior-Chief-Petty-Officer Master-Chief-Petty-Officer
           Ensign Lieutenant-Junior-Grade Lieutenant Lieutenant-Commander
           Commander Captain Commodore Rear-Admiral Vice-Admiral Admiral)
  (:partitions Navy-Person (:suffix nil)))
(defset Naval-Officer-Rank (:specializes Naval-Rank) (:values [Ensign..Admiral]))

;;; Numbers

(defconcept Real-Number :primitive
  (:annotation
    (:membership-test (lambda (self) (numberp self)))))
(definterval Integer :primitive (:specializes Real-Number)
  (:values [-INFINITY..INFINITY])
  (:annotation
    (:membership-test (lambda (self) (integerp self)))
    (:predecessor-fn (lambda (self) (1- self)))
    (:successor-fn (lambda (self) (1+ self)))))
(definterval Natural-Number (:specializes Integer) (:values [0..INFINITY]))
(definterval Positive-Integer (:specializes Integer) (:values [1..INFINITY]))
(definterval Non-Negative-Integer (:specializes Integer)
  (:values [-INFINITY..-1] [1..INFINITY]))


Figure  A-3:  Sets  and  Intervals 


Familial  Relations  Knowledge  Base 


;;; Person

(defconcept Person :primitive)

;;; Familial Relations

(defrelation parent :primitive
  (:axioms (:domain Person) (:range Person)))
(defrelation father (:specializes parent) (:range Male))
(defrelation grandparent (:composition-of parent parent))
(defrelation grandfather (:composition-of parent father))
(defrelation ancestor (:closure-of parent))
(defrelation child (:inverse-of parent))
(defrelation sibling (:composition-of parent child) (:specializes not-equal))
(defrelation brother (:specializes sibling) (:range Male))

Figure  A-4:  Familial  Relations 


Birds  Knowledge  Base 


(defconcept Animal :primitive
  (:axioms (:disjoint-covering Flying-Animal Non-Flying-Animal)))
(defconcept Flying-Animal :primitive (:specializes Animal))
(defconcept Non-Flying-Animal :primitive (:specializes Animal))
(defconcept Bird :primitive (:specializes Animal)
  (:defaults (:implies Flying-Animal)))
(defconcept Flying-Bird (:specializes Bird Flying-Animal))
(defconcept Penguin :primitive (:specializes Bird Non-Flying-Animal))


Figure  A-5:  Birds 


B.  Semantics  of  Term-Defining  Constructs 


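Each Loom expression e is listed with its semantics [[e]] on the
line below it.

(defconcept (:specializes C1 C2))
    $\lambda x.\ [\![C_1]\!](x) \wedge [\![C_2]\!](x)$

(defconcept (:restriction M (:vr C)))
    $\lambda x.\ \forall y\,([\![M]\!](x, y) \rightarrow [\![C]\!](y))$

(defconcept (:restriction M (:min n)))
    $\lambda x.\ \exists\, n \text{ distinct } y_i.\ \bigwedge_i [\![M]\!](x, y_i)$

(defconcept (:restriction M (:max n)))
    $\lambda x.\ \neg\exists\, n{+}1 \text{ distinct } y_i.\ \bigwedge_i [\![M]\!](x, y_i)$

(defconcept (:constraint CR (R1 R2) (S1 S2)))
    $\lambda x.\ \forall y, z\,(([\![R_1]\!] \circ [\![R_2]\!](x, y) \wedge [\![S_1]\!] \circ [\![S_2]\!](x, z)) \rightarrow [\![CR]\!](y, z))$

(defconcept (:constraint CR (R1 R2) v))
    $\lambda x.\ \forall y\,([\![R_1]\!] \circ [\![R_2]\!](x, y) \rightarrow [\![CR]\!](y, v))$

(defrelation (:specializes M1 M2))
    $\lambda x, y.\ [\![M_1]\!](x, y) \wedge [\![M_2]\!](x, y)$

(defrelation (:range C))
    $\lambda x, y.\ [\![C]\!](y)$

(defrelation (:inverse-of M))
    $\lambda x, y.\ [\![M]\!](y, x)$

(defrelation (:closure-of M))
    $\lambda x, y.\ [\![M]\!]^{+}(x, y)$

(defrelation (:composition-of M1 M2))
    $\lambda x, y.\ [\![M_1]\!] \circ [\![M_2]\!](x, y)$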
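
Several of these constructs can be read directly as operations on finite sets of pairs. The sketch below is an illustration under assumptions: relations are Python sets of (x, y) tuples, composition is taken to mean "follow M1, then M2" (as grandfather = parent then father suggests), and all function names are hypothetical.

```python
def vr(M, C):
    """:vr, every M-filler of x belongs to C."""
    return lambda x: all(y in C for (a, y) in M if a == x)


def at_least(M, n):
    """:min, x has at least n distinct M-fillers."""
    return lambda x: len({y for (a, y) in M if a == x}) >= n


def at_most(M, n):
    """:max, x does not have n+1 distinct M-fillers."""
    return lambda x: len({y for (a, y) in M if a == x}) <= n


def inverse(M):
    """:inverse-of"""
    return {(y, x) for (x, y) in M}


def compose(M1, M2):
    """:composition-of, follow M1 and then M2."""
    return {(x, z) for (x, y) in M1 for (y2, z) in M2 if y == y2}


def closure(M):
    """:closure-of, the transitive closure M+."""
    result = set(M)
    while True:
        extra = compose(result, result) - result
        if not extra:
            return result
        result |= extra


# (parent x y) is read as "y is a parent of x".
parent = {("ann", "bob"), ("bob", "cal")}
grandparent = compose(parent, parent)
ancestor = closure(parent)
child = inverse(parent)
```

With this toy relation, "cal" is the grandparent of "ann", and the ancestor relation adds exactly that pair to parent.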
A Framework for Situation Assessment:
Using  Best-Explanation  Reasoning 
To  Infer  Plans  from  Behavior 

John  R.  Josephson 

The  Ohio  State  University 
Columbus,  Ohio 


Abstract 

We propose a computational framework for battlefield
situation assessment, describing how the
relevant knowledge can be organized, represented, and
used in the service of problem solving. We describe
how the reasoning processes can be controlled to avoid
excessive and impractical amounts of search, and indicate
how the computations can be distributed in a
natural way to spread the burden over a community
of separate processors and processing sites. This
design extends previous work done at The Ohio State
LAIR on diagnostic reasoning and representation of
plan understanding, and applies it to the particular
military information-processing problem, a species of
the more general intellectual task of inferring plans
and intentions from behavior.

Introduction 

Nowhere is there a more compelling need for an AI
system than to multiply the effectiveness of the forces
defending Western Europe against possible aggression
from the Eastern Bloc. As this report is written the
United States and the Soviet Union are actively
engaged in negotiations whose avowed object is the
elimination of medium-range nuclear weapons from
Europe. Yet it has become increasingly clear
that the prospects for a more general nuclear disarmament
are severely limited by the NATO allies'
perceived need to rely on tactical nuclear weapons to
defend Western Europe against the numerically superior
forces of the Soviet Union and its allies. Thus
any technological innovation that can contribute to
multiplying the non-nuclear defensive effectiveness of
the western allies (or at least to their apparent
effectiveness) can contribute markedly to nuclear disarmament
at the low end of weapons yield, and thus
to breaking the path of escalation from conventional to
nuclear war. Equipping NATO field commanders with
AI systems that enhance their ability to respond
quickly and cleverly to developments on the battlefield
would constitute just such a technological innovation.


This paper proposes a computational framework for
battlefield situation assessment, the task of inferring
the plans and objectives of adversaries and other
players on the battlefield. We describe how a number
of the important kinds of knowledge needed for the
task can be organized, represented, and brought to
bear at the right times to contribute to the problem
solving. Several distinct but interacting types of
reasoning are needed, including: abduction or "best
explanation" reasoning; planning (for the other guy),
including route planning, resource allocation planning,
tactical goal choosing, and plan schema instantiation; and
also map-based spatial reasoning about proximity,
avenues of approach, formation, striking ranges of
weapons, importance of various terrain features, and
the like. The foregoing is not a complete list, and the
list is heterogeneous with respect to level of abstraction.
For example, abduction, or more specifically the
assembly of composite explanatory hypotheses, is needed
for the task of "diagnosing" enemy plans by trying to
produce coherent composite explanations of his behavior.
On our view classification reasoning is also
needed in the service of plan diagnosis, to organize the
reasoning processes whereby precompiled high-level
plan schemata are recognized as plausibly useful for
the interpretations of certain actions. That is,
classification is needed for plan recognition.

Throughout the design process we have been concerned
to structure the control strategies to avoid
excessive and impractical amounts of search. Search
processes in this domain could easily get out of hand,
and a formal definition of the problem space would
show that the space is multi-dimensionally
combinatorially explosive.1 In order to control search there
seems to be no escape from the necessity of
decomposing the problem into manageable subtasks,
chunking the knowledge base into meaningful and
moderately sized units, organizing the knowledge for
use by the problem solving processes, and modularizing
the reasoning so that reasoning strategies can be
tailored to the reasoning tasks. In short, the answer
to how to avoid impractical amounts of search is that
we must at all costs avoid or control complexity.
Since the problem set for the system is by its nature
complex, our recourse must be to redefine the problem
wherever it makes sense to do so, and modularize,
modularize, modularize.2 Moreover, not just any
modularization will do. Modules must perform meaningful
and accomplishable functions in the system, and
be able to act semi-autonomously without explosive
amounts of interaction, or they will not really contribute
to controlling the complexity.

1 The hypothesis composition problem alone is explosive in at
least two dimensions.

It appears that one nice effect of this extreme
modularity of the problem solving is that parallel and
distributed implementations of the design are possible,
not only taking advantage of the computing power of
multiprocessor architectures, but also distributing
responsibility for portions of the problem, in a natural
fashion, to geographically distributed processing sites.
In short we suggest giving each local commander, at
each level of the organization, his own semi-autonomous
decision support situation assessment system.
The individual systems, through intermittent
communication up, down, and locally sideways through
the command hierarchy, will integrate into a larger
system, itself more than the sum of its separate parts.

This design extends previous work done at The Ohio
State LAIR on diagnostic reasoning [4, 12, 16] and on
representing plan understanding [14, 5], and applies it
to the particular military information-processing
problem, a species of the more general intellectual task
of inferring plans and intentions from behavior.

In the terms of this paper situation assessment is
distinguished from data fusion as a distinct stage of
military information processing. The data fusion
process takes raw intelligence reports and produces a
description of the battlefield situation in terms of
various actors, at various scales of discrimination,
with information about their identities, types, actions,
locations, and motions. The fusion process takes care
of counting and classifying the various actors, tracking
them over time, unifying reports about the same actor
coming in at different times, attributing actions to the
correct actor, maintaining descriptions of actor states,
and so on. In summary, the output of the fusion
process is a description of the actors and their
"observed behavior".


2 See Herbert Simon's essay on The Architecture of
Complexity [18] for an explanation of why hierarchical
decomposition is such a useful and general strategy.


We can think of this fusion output as being
presented on a map board which is constantly updated
to reflect the most recent conclusions. On this board
appear symbols representing the actors and their
locations. We can imagine that the board is automated
to allow us to zoom in and out and examine the
situation at various granularities of resolution. Each
symbol on the board packages a rich data structure
containing not just identity, tracking history, recent
behavior, classification of the actor as to type (e.g.,
artillery battalion), pointers to its parts, and so on;
beyond that, each actor symbol on the board indexes
directly or indirectly into everything that is known (or
surmised) about that actor. Uncertainties are
represented explicitly (e.g., that this is probably the
33rd rifle brigade, but it might be the 44th). To
keep things manageable, the number of alternative
values for a given attribute should be kept small, on
the order of a best estimate plus one, or at most two,
alternative values. While most of the information flow
will be from the fusion process, through the board, to
the situation assessment process, we can allow for the
possibility of some information flowing the other way.
For example, the fusion process might judge that a
unit is probably of type A, with type B as an available
alternative interpretation; subsequent situation
assessment reasoning may then find that it would
make no coherent sense for the unit to be of type A,
but that it does make sense for it to be of type B,
and this information can be passed back to the board.
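
As a rough illustration of the kind of data structure a board symbol
might package, the following Python sketch keeps a ranked list of at
most three candidate values per uncertain attribute (a best estimate
plus up to two alternatives) and lets situation assessment promote an
alternative. The names UncertainAttribute, ActorSymbol and promote, and
the sample unit, are hypothetical and are not taken from the system
described here.

```python
from dataclasses import dataclass, field

MAX_VALUES = 3  # best estimate plus at most two alternatives


@dataclass
class UncertainAttribute:
    """Ranked candidate values; index 0 is the current best estimate."""
    values: list = field(default_factory=list)

    def __post_init__(self):
        if len(self.values) > MAX_VALUES:
            raise ValueError("keep the number of alternatives small")

    @property
    def best(self):
        return self.values[0]

    def promote(self, value):
        """Make an existing alternative the best estimate.

        Models information flowing back from situation assessment
        to the board; it never introduces a new value.
        """
        if value not in self.values:
            raise ValueError(f"{value!r} is not a recorded alternative")
        self.values.remove(value)
        self.values.insert(0, value)


@dataclass
class ActorSymbol:
    """Hypothetical board symbol; real symbols would index far more."""
    name: str
    unit_type: UncertainAttribute
    identity: UncertainAttribute
    parts: list = field(default_factory=list)   # pointers to sub-units
    track: list = field(default_factory=list)   # tracking history


symbol = ActorSymbol(
    name="unit-17",
    unit_type=UncertainAttribute(["type A", "type B"]),
    identity=UncertainAttribute(["33rd rifle brigade", "44th rifle brigade"]),
)

# Situation assessment finds that type A makes no coherent sense.
symbol.unit_type.promote("type B")
print(symbol.unit_type.best, symbol.unit_type.values)
```

Keeping the alternatives on the symbol itself is one way to let a later
reinterpretation be a cheap reordering rather than a fresh fusion pass.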

The fusion process is itself a difficult problem for an
AI system to address, but taking it for granted
anyway (it might be automated or not), this report
describes a system that will take the output from the
fusion process and use it to infer an adversary’s plans
and intentions. Thus the function of the situation
assessment process can be described as “plan diagnosis”.
Since the input to the assessment process is
information about actors and their behavior, and the
output can be thought of as a coherent explanation of
that behavior, we can see the whole process as being
a form of “best explanation” reasoning, or abduction.

“Abduction”

Abduction, or Inference to the Best
Explanation, is a form of inference that follows a
pattern something like this:

D is a collection of data (facts, observations,
givens),

H explains D (would, if true, explain D),

No other hypothesis explains D as well
as H does.


Therefore, H is (probably) correct.

The strength of an abductive conclusion will in
general depend on several factors, including:

•  how  good  H  is  by  itself,  independently  of 
considering  the  alternatives, 

•  how  decisively  H  surpasses  the  alternatives, 

•  how  thorough  the  search  was  for  alternative 
explanations,  and 

•  pragmatic  considerations,  including

   o  the costs of being wrong and the
      benefits of being right,

   o  how strong the need is to come to a
      conclusion at all, especially considering
      the possibility of seeking further
      evidence before concluding.

Abductions,  as  we  have  just  characterized  them,  go 
from  data  describing  something  to  an  explanatory 
hypothesis  that  best  accounts  for  that  data. 
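
The pattern can be made concrete with a small Python sketch. It is only
an illustration under strong assumptions: each hypothesis H is given a
set of data items it would explain and a numeric plausibility score,
neither of which is prescribed by the text, and the names Hypothesis
and best_explanation are hypothetical. The sketch selects the most
plausible hypothesis that explains all of the data D and reports how
decisively it surpasses the runner-up, which is one of the factors
listed above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hypothesis:
    name: str
    explains: frozenset   # data items this hypothesis would explain if true
    plausibility: float   # assumed score in [0, 1]; not from the paper


def best_explanation(data, hypotheses):
    """Return (best, margin) over hypotheses that explain all of `data`.

    `margin` is the winner's plausibility minus the runner-up's, a crude
    stand-in for "how decisively H surpasses the alternatives".
    Returns (None, 0.0) when no hypothesis explains the data.
    """
    candidates = sorted(
        (h for h in hypotheses if data <= h.explains),
        key=lambda h: h.plausibility,
        reverse=True,
    )
    if not candidates:
        return None, 0.0
    runner_up = candidates[1].plausibility if len(candidates) > 1 else 0.0
    return candidates[0], candidates[0].plausibility - runner_up


D = frozenset({"moving toward friendly positions", "artillery firing"})
H = [
    Hypothesis("attack", D, 0.8),
    Hypothesis("feint", D, 0.5),
    Hypothesis("withdrawal", frozenset({"artillery firing"}), 0.6),
]

best, margin = best_explanation(D, H)
print(best.name, round(margin, 2))
```

A small margin, or a hypothesis list known to be incomplete, would argue
for withholding the conclusion or seeking further evidence.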

Notice that calling an inference “abduction” carries
with it the idea of its goal: a best explanation.
Contrast this with characterizing an inference as
“deduction”, which carries instead the idea of a
constraint that is satisfied: that the inference is
guaranteed to be truth-preserving. Since there is no
intrinsic incompatibility between explanatory goals and
truth-preservation constraints, it is conceivable for
there to be deductive abductions. In fact, if all of the
alternative ways of explaining something are
exhaustively enumerated, and all but one of the explanations
are decisively eliminated, the overall pattern of
inference is deductively valid.

Arguably abduction is itself an epistemologically
fundamental form of reasoning, not reducible to
deduction, probabilistic induction, or any combination of
them [9, 10]. But whether or not abductions can be
justified on logical grounds, they appear to be
ubiquitous in the un-selfconscious reasonings, interpretations,
and perceivings of ordinary life, and in the more
critically self-aware reasonings upon which scientific
theories are based [10].

It  is  a  common  view  that  diagnostic  reasoning  in 
general  is  abduction  6.  lf>,  17, .  The  idea  is  that  the 
task of a diagnostic reasoner is to come up with a
best explanation for the set of symptoms. In this
paper we take the point of view that the overall
plan-diagnosis situation-assessment task is best understood
as a form of abduction.3

“Explanation”

There are numerous senses of the term
“explanation” in common use; several are relevant to
this paper. At one extreme we may speak of a
scientific theory explaining some physical phenomenon, as
for example, how Newton’s Theory of Gravitation
explains the tides. In related senses we may speak of
explaining historical events by economic theories,
explaining some human behavior using a theory of
motives, and explaining the actions of some
administrative unit using a theory of institutional goals and
missions. In each of these examples the abstract
structure, the “theory”, explains some phenomenon by
placing it in a larger context, describing important things
about what has made the phenomenon to be the way
it is. Usually an explaining theory is some sort of
conceptual structure which presents in some fashion
the “causes” of the things being explained. When a
theory explains by describing causes and causal
relationships, we may reasonably speak of “causal
explanation”. The senses of “explanation” important
for this paper are all senses of “causal explanation”.

When somebody “explains” something to somebody,
giving a causal explanation of some physical event for
example, the explainer conveys (more or less
accurately) a theory, a conceptual structure, to the
explainee. If all goes well, the explainee understands
something about the event that he/she didn’t
understand before. We may say that to understand an
event (in this particular sense) is to grasp a causal
explanation of it.4 A causal explanation is a structure
made up of linked concepts, including concepts of the
event and of its supposed causal antecedents.

Explaining purposive or goal-directed behavior
introduces a new dimension into this account of
explanation. The existence within an agent of a goal which
influences the behavior of the agent makes the
presence of that goal an important part of the causal
ancestry of the behavior it influences. Thus to
adequately explain goal-directed behavior we need to
mention the goals that have actively influenced the
behavior. Explanations that make reference to goals are
usually referred to as teleological explanations. In a
situation assessment system, the explanations given for
an adversary’s behavior are teleological explanations.


3 Note how abduction was used to justify itself here!

4 “We suppose ourselves to possess unqualified scientific
knowledge of a thing, ... , when we think we know the cause on
which the fact depends, as the cause of that fact and no other
and, further, that the fact could not be other than it is.

What I now assert is that at all events we do know by
demonstration. By demonstration I mean a syllogism productive
of scientific knowledge, a syllogism, that is, the grasp of which is
eo ipso such knowledge. Assuming then that my thesis as to the
nature of scientific knowing is correct, the premisses of
demonstrated knowledge must be true, primary, immediate, better
known than and prior to the conclusion, which is further related
to them as effect to cause.” Aristotle [2]
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Intelligent  agents,  insofar  as  they  ARE  intelligent 
agents, that is, insofar as they accomplish smart
thinking as opposed to stupid thinking, do things for
good reasons, whether they know it or not. What
makes them smart is a sort of appropriateness of the
thinking processes for the task at hand, and for the
way the world is. When an intelligent agent makes a
decision by explicitly considering reasons for making
the decision one way or another, those reasons form a
significant part of the causal ancestry of the decision.
For intelligence, reasons are causes. Thus when we
ask an intelligent system to explain its reasoning
processes by giving reasons, we are once again asking
for a form of causal explanation.

Design  Considerations  for  Automated  Situation 
Assessment 

We strongly suggest that the system design be
oriented towards producing a “realistic appraisal”
of the situation, implying that the system should have
the following characteristics.

•  The system should keep track of where its
interpretations are most certain, and where
they are less so. It should have robust,
common-sensical behavior in where it places
the most certainty, and use these places as
anchor points for further inference. The
system shouldn’t be “flighty” in its
reasoning, uncritically engaging in long chains of
inference without anchor points. In
particular the system should be well grounded
in the hardest evidence about the
adversary’s behavior; hard evidence should
not be ignorable. The system should be
strongly driven to explain the best attested
behavior in some detail. (The best attested
behavior, by the way, will not just include
items at the smallest grainsize; for example
sometimes we could be surer that there is
an attack going on than about what the
details are. Many different items of
behavior, all attesting to the existence of a
particular adversary plan, could in principle
render that plan more certain than any
single piece of evidence for it.) The system
should seek these anchor points, infer a
step or two beyond them into the realm of
educated guessing, but go no further: some
behavior should be left uninterpreted if
necessary to avoid unwarranted speculation.

•  The system should attribute only reasonably
“plausible” plans to the adversary. This
will require appraising the feasibility and
utility of hypothesized plans, and also
appraising the likelihood of these plans based
on conformity to known characteristic ways
of behaving.


•  The system should lean neither towards
optimism nor pessimism in its appraisal, but
should aim instead towards a realistic
balance of these tendencies. We believe
that a basic system, structured to be
realistic, will provide a firm foundation for
pessimistic or optimistic variations on the basic
problem solving that might be intentionally
engaged in for special purposes. In
particular it would be a big mistake to bias
the main system towards pessimism in
order to assist with “blunder avoidance” since
this would introduce the new danger of
blundering by overreacting to mere
possibilities. A better way to do things would
be to maintain, alongside the main realistic
situation assessment, a worst-case estimate
based primarily on capabilities.

•  The system design should avoid making
unrealistic idealizing assumptions about the
adversary such as attributing to him perfect
communications and coordination of actions,
or perfect knowledge of environmental
conditions or friendly force deployment, or the
ability to always make flawless plans. In
short, the system should not, in any simple
way, suppose that the adversary is acting
consistently.

•  The problem solving architecture should
support a system which strongly “tends
towards the right answer”. What this
means is clear enough, but it is difficult to
state what this implies for system
architecture. It implies at least that knowledge
structures should be redundant so that, if
the system can’t figure things out one way,
there is a good chance that it will do it in
another. It also implies that problem
solving control strategies should have
characteristics such that, as more and more
information is available, in principle revealing
the adversary’s plans with greater and
greater clarity, the system will produce
hypotheses about those plans with greater
and greater accuracy, and increasing
confidence.

Some  other  desirable  system  characteristics: 

•  It should keep track of what the main
alternatives are to its interpretations of
events, so that interpretations can change
rapidly if necessary, and so that events can
be locally reinterpreted where this is
indicated, in order to achieve as much as
possible a unified and coherent
interpretation of the situation.

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•  It  should  use  control  strategies  with  good 
computational characteristics to avoid the
potentially very explosive combinatorics of
hypothesis assembly and criticism.
Computational strategies should be feasible in
the sense that they should be efficient and
scale up well. Criticism is potentially very
expensive and needs to be used judiciously.
“Jumping to conclusions” is not nearly as
computationally expensive as being careful
to systematically rule out alternative
interpretations. An ideal system would be a
good guesser to start with, and a judicious
critic, able to use small amounts of
criticism to good advantage, and more
criticism, when computational resources are
plentiful, to even greater advantage. It is
not feasible to explicitly generate all possible
composite hypotheses.

•  It is important to be able to project the
inferred plans forward in time, both for
purposes of counterplanning and for monitoring
the unfolding of events to confirm or revise
hypotheses.

•  It should use reasoning processes which are
explainable. Insofar as its reasoning
processes are well designed, and reasonably
common-sensical, this should not present
too much of a problem. It becomes a
major problem only if its reasoning
strategies are counterintuitive or
incomprehensible. For example, important
conclusions shouldn’t fall like magic out of
unfollowable number manipulations.

Elements  of  the  Design 

Watching  the  Board:  Actor-Centered  Abducers 

As we said earlier, output from the data fusion
process is presented to the situation assessment system
as the behavior of actors in the map region of
interest. This activity can be thought of as being
displayed on a map board which has symbols on it
representing the actors, and packaging what we know
about them. The map can be thought of as having
variable resolution, including symbols for military units
over a range of scales. For example, army divisions,
regiments, battalions, and companies might all be
represented and linked hierarchically within one
dynamic mapboard data structure.

In order to organize the problem solving, we propose
assigning one abducer to watch each actor symbol on
the board. The abducer’s job is to track the actor’s
activities, and continually strive to explain its actions
by composing hypotheses about its plans and
objectives. This way the problem of explaining the totality
of an adversary’s behavior can be decomposed into the
distinct subproblems of explaining the behavior of each
agent. Besides being a useful way to decompose the
overall abductive task into manageable and
semi-independent subtasks, this decomposition allows for
parallel processing of the input data stream to the
system, which has clear advantages for speed of
computation.

The  watching  abducers  can  be  thought  of  as  forming 
a  hierarchy  corresponding  to  our  best  estimate  of  the 
adversary’s  hierarchy  of  organization  and  command. 
Abducers  communicate  up,  down,  and,  when  useful, 
across  their  hierarchy  in  order  to  cooperate  in  forming 
a  coherent  overall  account  of  the  enemy’s  behavior. 
Abducer  intercommunication  is  used  to  propagate  the 
inferential  leverage  provided  by  high-confidence  local 
conclusions wherever they may appear in the
hierarchy.

In the activity of each watching abducer, merely
finding a plausible plan that includes the observed
actions is not sufficient, not even if it coheres well with
everything else we know at that point. Beyond that,
we need to know correctly and realistically what the
adversary’s plan is, so practically we have to know
how sure we are that our judgment is correct, and for
example that there is no other good explanation for
the actions. Finding a plausible including plan isn’t
enough; we also have to subject the inference to
criticism so we know what’s sure and what isn’t;
what’s  the  only  plausible  interpretation  of  events,  and 
what’s  conjecture.  Overall  the  system’s  task  is  to 
form  a  coherent  theory  of  the  actions  of  the  adversary, 
coherent  at  all  levels,  and  to  evaluate  the  confidence 
status  of  that  theory  in  whole,  and  in  each  of  its 
parts. 

The proposed design for each abducer is derived
from work on abduction engines which has its origins
in medical diagnosis [11, 19, 12, 16]. In brief, each
abducer is to be a specialized means-ends problem solver
whose goal is to explain the significant findings as
well as possible by forming, criticizing, maintaining,
and improving a compound explanatory hypothesis.
Along the way it keeps careful account of just how
good an explanation it has built, which parts are firm,
and which are only guesses. This basic strategy for
forming and criticizing composite hypotheses has
already stood the test of a working implementation for
a real-world task [20], and has been described in some
detail elsewhere [12, 16]. In a companion paper to
this one we show a way of producing concurrent
realizations of these sorts of abduction engines.
(Concurrent processing within each abducer, not just
among them as described above.) The new domain
can be expected to introduce new challenges for
designing this sort of abduction engine, and many
details remain to be worked out, but the basic
approach has already been proved to be workable.

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Classification  Problem  Solving  to  Control  Plan 
Recognition 

The abducers will need sources of explanatory
hypothesis fragments to synthesize into local composite
explanations. The organizing principle is that more
general schemata are at the “root” or top of the
hierarchy, and more detailed ones are below; the
hierarchy is ordered by the specialization or
type-subtype relation. A tank battalion, for example, is
associated with certain prestored hierarchies of plan
schemata appropriate just for tank battalions. The
actor’s watching abducer will use these hierarchies at
run time as sources of plausible hypothesis fragments.
See Figure 1 for a sketch of a hierarchy derived from
a U.S. Army field manual [8].

Associated with each node in the hierarchy is
precompiled recognition knowledge that measures the
confidence with which that node can form a
hypothesis for the activity currently under
consideration. This recognition knowledge is a place where we can
locate knowledge that will allow us to make a quick
decision whether the hypothesis can be ruled out or
confirmed. If the hypothesis can be neither ruled out
nor confirmed by a quick check of the situation, more
involved and expensive types of reasoning will be
employed at the appropriate times. Each node is
represented by a classification specialist, a problem
solving agent with embedded knowledge for its
particular classification task. When one of the
classification hierarchies is activated by a working abducer, the
classification problem solving proceeds top-down,
following what we have called an establish-refine control
regime. Each classification specialist either rules out
its hypothesis, pruning the search tree at that level of
generality, or establishes its hypothesis and passes
activation and control along to its subspecialists in
parallel.5 A third possible action for a classification
specialist is to suspend processing, based on
intermediate levels of confidence, to be reawakened by the
abducer if initially more promising hypotheses fail to
work out. By using this form of control the
hypothesis space can be quickly explored, and a small
number of plausible hypotheses found which are
worthy of further investigation. There already exists a
tool, CSRL, for implementing this sort of classification
problem solving [3].
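
The establish-refine regime described above can be sketched in a few
lines of Python. This is an illustrative sketch only, not the CSRL
implementation: the Specialist class, the numeric confidence thresholds
(low, high), the example plan names and the recognition functions are
all assumptions made for the example, and the children are visited
sequentially as a stand-in for the parallel activation described in the
text.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Specialist:
    """One node of a plan classification hierarchy (hypothetical)."""
    name: str
    recognize: Callable[[dict], float]   # quick confidence check in [0, 1]
    children: list = field(default_factory=list)


def establish_refine(node, facts, low=0.2, high=0.7):
    """Top-down classification; returns (established, suspended) names.

    confidence < low   -> rule out and prune the whole subtree
    low <= c < high    -> suspend; the abducer may reawaken it later
    confidence >= high -> establish and pass control to subspecialists
    """
    confidence = node.recognize(facts)
    if confidence < low:
        return [], []
    if confidence < high:
        return [], [node.name]
    established, suspended = [node.name], []
    for child in node.children:
        child_established, child_suspended = establish_refine(
            child, facts, low, high)
        established += child_established
        suspended += child_suspended
    return established, suspended


# Toy hierarchy; plan names and scores are invented for illustration.
root = Specialist(
    "attack",
    lambda f: 0.9 if f["closing"] else 0.1,
    [
        Specialist("deliberate attack",
                   lambda f: 0.8 if f["artillery_prep"] else 0.1),
        Specialist("hasty attack", lambda f: 0.5),
    ],
)

established, suspended = establish_refine(
    root, {"closing": True, "artillery_prep": False})
print(established, suspended)
```

Because a ruled-out node never activates its children, the cost of a
traversal depends on how early the quick checks can prune.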


5 This is the third opportunity we have found for parallelism,
the other two being parallel abduction on parallel input streams,
and concurrent processing within each abduction engine. This
time advantage is being taken of the parallelism which is natural
to the hierarchical classification task. A fourth opportunity for
parallelism is provided by parallel evaluation of precompiled
recognition features for each plan fragment, but we will not
discuss that here.


This highly modular and controlled way of
performing the necessary initial plan recognition is in marked
contrast to other approaches to plan recognition that
have been proposed [21, 13, 6].

Need  for  an  Intelligent  Map  Overlay 

In order for the compiled plan recognition knowledge
to function within each classification specialist, the
specialist will need to make pointed queries to the
map board to determine the facts relevant to its
decision. But the level of abstraction of the facts needed
for recognition will probably not match well to the
level of abstraction explicitly represented on the map
board. For example, in order to decide whether
‘attack’ is an appropriate high-level hypothesis, the
recognizer might want to know whether the enemy is
moving towards or away from friendly positions. This
would probably not be a fact that is stored explicitly
and in those terms on the map board; it will be
necessary to infer it from lower-level descriptions.
Thus we postulate an intelligent database overlay for
the map to provide forms of inference that support
presenting the data to the plan recognizers at a level
of abstraction above that of the raw data. This is a
form of the data abstraction task identified by
Chandrasekaran and Mittal [4] and Clancey [7]. A
number of forms of spatial reasoning will be necessary
here; just which sorts will have to be determined
empirically.
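
To give a flavor of the simplest kind of inference such an overlay
might provide, the Python sketch below answers the "moving towards or
away from friendly positions" query from a raw track of positions. The
function name is_closing, the planar (x, y) coordinates and the
first-versus-last distance comparison are assumptions chosen for
brevity; a real overlay would need to reason about terrain, routes and
noisy observations.

```python
import math


def is_closing(track, friendly_position, tolerance=0.0):
    """Infer whether an actor is moving towards a friendly position.

    `track` is a time-ordered list of (x, y) positions from the board.
    Returns True if the latest position is nearer than the earliest by
    more than `tolerance`, False otherwise, and None if the track is
    too short to support any inference.
    """
    if len(track) < 2:
        return None
    first = math.dist(track[0], friendly_position)
    last = math.dist(track[-1], friendly_position)
    return last < first - tolerance


track = [(10.0, 10.0), (8.0, 7.0), (5.0, 4.0)]
print(is_closing(track, (0.0, 0.0)))
```

A classification specialist could then ask for the abstract fact
directly instead of inspecting the raw positions itself.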

Planning  Components  to  Instantiate  Plan  Schemata, 
Find  Routes,  Allocate  Resources,  and  Determine 
Targets  and  Objectives 

When a classification node matches and establishes
its sponsored plan schema at a certain level of
confidence, it will often be necessary to go into the
represented hypothesis in significantly more detail.
It’s not enough to know that the actor is plausibly
following a plan for a certain type of attack; we want
to know where the attack is coming, what the targets
are, and what are the likely avenues of approach.
Once a plan schema has been found to be plausibly
applicable to the case, it will be necessary to
instantiate the plan schema in some detail to
determine,

•  whether the plan can in fact be carried out;
and

•  what the probable details are, so that
important elements like target, etc. can be
uncovered, and so that the unfolding of the
plan can be projected into the future.

Thus the system must be able to take the
adversary’s point of view to choose targets, plan
plausible routes, allocate resources and so on. It is
likely that target/objective identification is a
specialized need of the plan schema instantiation process,
and will need its own specialized problem solver with
its own knowledge organization and problem-solving
strategies. There might well be other specialized
subtasks calling for specialized reasoning modules, but for
now let us suppose that all such reasoning can be
lumped together into one large planning module
capable of taking a plan schema as input, and
instantiating the schema for the concrete situation by
choosing targets, etc. This planner will return a plan to
the abducer (if possible) with information about the
degree of feasibility of the plan and with details filled
in as far as this can be done without combinatorially
exploring a space of branching alternatives.

The  abducer  and  the  planner  communicate  through 
a  shared  language  of  plan  fragment  representation. 
From the abducer’s point of view plan fragments are
hypotheses about behavior, to be constrained by what
is plausible and achievable; while from the planner’s
they are proposed courses of action, to be filled in
with  details,  while  being  constrained  by  the  facts  as 
best  they  are  known. 

Representing Plans, Goals, Behavior, States, and
Intentions

The proposed language for representing plans, plan
functions, behaviors by which plan objectives are
achieved, intermediate states, and the roles of various
actors in plans, is the Sembugamoorthy and
Chandrasekaran Functional Representation Language.
This has been described in [14] and has been
elaborated for plan representation in [5].

The basic idea is that an overall plan (or device) is
represented as an organized network of plan fragment
frames. The main scaffolding of this network is a set
of what-how frames, each frame associating a plan
functionality (what is achieved) with the behavior of a
particular agent (the how). Agent behavior is
represented as a directed network of plan states, each
state transition link representing a plan step, i.e.,
something that needs to be accomplished in order for
the overall behavior to proceed as planned. Each of
the state-state links is annotated by being frame-frame
linked to a specification of how that particular state
change is supposed to be accomplished. For example,
a change from plan state ‘target unscathed’ to state
‘target softened up’ might be annotated with a link to
a certain artillery subunit frame, thus specifying the
agent that will be responsible for accomplishing the
transition, and packaging how it will be done.

Besides frames describing various types of agent and
their behaviors, there should be frames for packaging
detailed sub-behaviors including maneuvers and routes,
and for packaging knowledge of how other state
transitions occur (for example, night falls naturally without
action being required). Besides a state link annotation
of how the transition will be accomplished, a link
should be annotated with information about how long
it will take, and with the rationale for the particular
step’s presence in the behavior (for example, to
establish a precondition for a later step).

Figure  2  shows  a  partially  ordered  network  of  plan 
states  with  links  annotated  to  represent  some  of  the 
types  of  frames  that  can  be  given  responsibility  for 
various  state  transitions. 
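
As a concrete, if much simplified, picture of such a network, the
Python sketch below represents one frame that pairs a plan function
with an agent's behavior, where the behavior is a list of annotated
state transitions. It is not the Functional Representation Language
itself: the class names (Transition, WhatHowFrame), the field names,
the durations and the example plan are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Transition:
    """A plan step: a link between two plan states, with annotations."""
    source: str
    target: str
    responsible: Optional[str] = None       # frame that accomplishes it;
                                            # None if it occurs naturally
    duration_hours: Optional[float] = None  # how long it will take
    rationale: str = ""                     # why the step is in the plan


@dataclass
class WhatHowFrame:
    function: str                 # the "what": what is achieved
    agent: str                    # whose behavior is the "how"
    transitions: list = field(default_factory=list)

    def successors(self, state):
        """Plan states directly reachable from `state`."""
        return [t.target for t in self.transitions if t.source == state]

    def unassigned(self):
        """Transitions with no responsible frame attached."""
        return [t for t in self.transitions if t.responsible is None]


plan = WhatHowFrame(
    function="seize objective",
    agent="tank battalion",
    transitions=[
        Transition("target unscathed", "target softened up",
                   responsible="artillery subunit",
                   duration_hours=2.0,
                   rationale="establish a precondition for a later step"),
        Transition("daylight", "darkness",
                   rationale="night falls naturally"),
    ],
)

print(plan.successors("target unscathed"))
print([t.target for t in plan.unassigned()])
```

Keeping the rationale on each link is what would let a later component
explain why a step is expected, not merely that it is expected.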

Summarizing  the  problem  solving  process 

•  Something changes on the map board,
causing the appropriate abducer to wake up
and try to accommodate the new
information into its understanding of what the
actor is doing.

•  This abducer checks whether this new
activity is already anticipated by its current
hypothesis. If so, then this hypothesis can
be updated in detail, and revised as to
confidence level. If not, then the hypothesis
will have to be reconsidered, and perhaps a
wholly new hypothesis formed.

•  Unanticipated  activity  represents  something 
that  needs  to  be  explained,  and  appropriate 
sources  of  hypotheses  for  the  type  of  agent 
and  activity  are  consulted  to  find  plausible 
explanatory  hypotheses. 

•  Plausible hypotheses are explored by the
planner to see if details can be successfully
worked out, and to see if they will succeed
in explaining the activity.

•  If they are useful for explaining things,
successful hypotheses will be pursued by
moving lower in the plan classification
hierarchy, and invoking the planner on
viable alternatives to try to fill in details.

On each invocation the planner will only
fill in details as far as it can go without
significant branching of alternatives,
announcing disjunctive alternatives and
stopping, rather than pursuing disjunctions
within disjunctions.

•  The more detailed plausible hypotheses
determined in this manner, generated from
prestored schemata in consultation with the
planner, become resources available to the
abducer for inclusion in a best composite
hypothesis for the activities of the
particular actor.

•  The abducer forms its best explanation
using the plausible plan fragments, and
taking account of the coherence constraints
and suggestive information made available
by other abducers above, below, and lateral
in the abducer hierarchy.

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•  Overall  this  process  of  forming  local  best 
explanations in all of the active abducers in
parallel should, hopefully, settle down before
very long, because each abducer quits when
it has done the best it can locally, and
because the action of all of those local
abducers, each trying to form a local best
explanation of what is going on, and each
taking account as best it can of the
hypotheses formed by its conceptual
neighbors, collectively tends towards producing
the best possible answer globally.

If our representations are expressive enough,
and our hypothesis-improvement control is
aptly wrought, then the best answer
globally, towards which the system tends,
will be the “right answer”, the actual plans
of the other guy. Thrashing about of the
system thus has two limits. From the
outside the limit is the way the world actually
is, whatever is really happening out there,
which the system is designed to infer, and
towards which it hopefully tends. From
the inside the limit is imposed by clever
strategies by which the problem solver quits
trying to improve the hypothesis after it
has done the best it can with the
information available, and after a modest
commitment of computational resources.

• Overall, the process is one where islands of
higher certainty are established, and a wave
of probable reasoning drives outward from
them to see what else can be plausibly
inferred.

Figure 2: Representing Plan Understanding

Abstract

The  information  processing  task  in  abductive 
reasoning  is  to  infer  a  best  explanation  for  a  set  of 
data.  Some  typical  subtasks  of  this  are  generating 
hypotheses  that  can  account  for  various  portions  of 
the  data,  and  synthesizing  a  composite  hypothesis  that 
best explains the whole data set. In this paper we
provide  task-specific  concurrent  algorithms  for  some  of 
the  subtasks  of  abductive  reasoning.  In  particular  we 
present  a  blackboard  architecture  and  a  marker 
algorithm  for  the  task  of  synthesizing  a  composite 
hypothesis. 

1.  Introduction 

Abductive inference has received significant
attention in research on knowledge-using reasoning
and on the construction of knowledge-based systems
[Charniak and McDermott, 1985; Josephson et al., 1987;
Pople, 1977; Reggia, 1983]. The information processing
task in abductive reasoning is to infer a hypothesis that
best explains a set of data. A typical subtask is to
generate hypotheses that account for different subsets
of the data. Another subtask is to use these
hypotheses in synthesizing a composite hypothesis that
best explains the entire data set. However, synthesis
of a composite explanatory hypothesis in the presence
of certain types of interactions between component
hypotheses may be computationally very expensive
[Allemang et al., 1987]. This suggests exploiting
concurrency for the construction of abductive inference
making systems. Indeed, with the increasing
availability of concurrent machines, exploration and
exploitation of concurrency in abductive reasoning is
quite timely. Moreover, analyzing the processing
dependencies to determine where concurrent
mechanisms apply can be expected to increase our
understanding of abductive problem solving generally.

We use the term “concurrency” here to imply
non-serial processing which has characteristics of both
parallel and distributed processing. In AI research
concurrency is being explored at several different levels
of organization:

•  Neural  architecture  level:  e.g.  research  on 
parallel  and  distributed  processing  in  the 
connectionist  paradigm.  At  this  level,  the 
grain  size  of  concurrently  executable 
processes  is  very  small,  and  the  parallelism 
between  them  is  massive. 

•  Language  architecture  level:  e.g.  research  on 
parallelism  at  the  level  of  Lisp,  or  Prolog. 

•  Symbolic  architecture  level:  e.g.  research  on 
parallelism  at  the  level  of  production  rules. 

• Knowledge architecture level: functionally
accurate, cooperative systems, which use a
blackboard architecture for control and
communication in distributed problem solving
[Lesser and Corkill, 1983], are an example of
this kind of work. At this level, the grain
size of concurrently executable processes is
medium to large, and the parallelism
between them is moderate to limited.

Our  current  research  on  concurrency  in  abductive 
reasoning  is  also  at  the  knowledge  architecture  level  of 
abstraction.  Thus  our  analysis  is  in  the  language  of 
functional  specifications  of  problem  solving  tasks  and 
subtasks,  mechanisms  for  problem  solving  in  the  form 
of  appropriate  knowledge  and  control  structures,  and 
communication  between  cooperating  problem  solvers. 

2.  Abductive  Reasoning 

2.1.  The  Form  of  Abduction 

Abduction is a form of logical inference that may
be characterized as follows [Josephson et al., 1987]:

    D is a collection of data
        (facts, observations, givens),
    C explains D
        (would, if true, explain D),
    No other hypothesis explains D
        as well as C does.

    Therefore, C is (probably) correct.

Abductive inference appears to be ubiquitous in
knowledge-using reasoning processes. Abduction
occurs in diagnostic problem solving, where the data is
in the form of symptoms, and the explanatory
hypotheses are diseases or component malfunctions.
Data interpretation (as in DENDRAL), where the data
is in the form of sensor readings and the explanatory
hypotheses are about object structures, and military
situation assessment, where the data is in the form of
events and the explanatory hypotheses are plans
ascribed to the adversary, are also instances of
abductive reasoning. Some aspects of perception, and
some aspects of natural language understanding,
appear to be abductive in character as well.

2.2. Abductive Task and Subtasks

Our research on abduction takes place in the
context of a theory of generic tasks in
knowledge-using problem solving [Chandrasekaran,
1986]. A generic task corresponds to a primitive type
of reasoning, associated with which are organizations
of knowledge and controls of problem solving
appropriate for it. Classification of a set of data
describing a specific situation onto a precompiled
taxonomy of hypotheses, for instance, is one generic
task; another is the abductive assembly of a
composite explanatory hypothesis using as components
hypotheses that account for different subsets of the
data. The organization of knowledge and control of
problem solving appropriate for the classification task
are different from those for the assembly task.

Generic tasks provide a high-level vocabulary for
characterizing complex reasoning processes, and provide
high-level building blocks for constructing integrated
knowledge-using systems. Josephson et al.
[Josephson et al., 1987] have shown that the abductive
task can be decomposed under some circumstances into
the generic tasks of hierarchical classification, and
assembly and criticism of a best composite explanatory
hypothesis.

2.3. Abductive Assembly Systems

Probably the best known knowledge-using system
that performs a form of abductive assembly is the
INTERNIST system for diagnosis in internal medicine
[Miller et al., 1982]. INTERNIST will continue to
conjoin further diseases to a growing diagnostic
conclusion until all of the important findings have
been accounted for. The DENDRAL system
[Buchanan et al., 1969] performs abductive assembly in
its task of elucidating molecular structure from a mass
spectrogram. DENDRAL assembles an explanatory
hypothesis in the literal sense that it assembles a
model of a molecule that represents an hypothesis
about the parent molecule causing the various lines in
the mass spectrographic data. This model molecule is
assembled from fragments representing hypotheses
about what is causing various spectral lines.

The RED system, an integrated expert system
for identifying red-cell antibodies for use in medical
blood banks, explicitly uses a classification and
assembly mechanism for performance of the abductive
task. In RED a classification module systematically
searches a space of precompiled hypotheses to find
ones plausibly applicable to the case, determines the
prima facie likelihoods of the plausible hypotheses that
are found, and determines what each plausible
hypothesis can explain of the data for the case. An
assembly module considers the hypotheses with high
likelihoods as candidates for inclusion into a composite
explanatory hypothesis, and assembles the composite
hypothesis that is a best explanation for the data set.

The MDX2 system, an integrated expert system
for diagnosis of a class of diseases in internal
medicine, also uses the classification and assembly
mechanism [Sticklen, 1987]. MDX2 contains multiple
classification modules that perform their duties in
different areas of medicine. It also contains an
assembly module that directs the activities of the
classification modules, and assembles a composite
explanatory hypothesis using component hypotheses
from the different classification modules.

PEIRCE is a knowledge representation language
or tool under development at Ohio State for
constructing problem solving systems which assemble a
composite explanatory hypothesis, using as components
hypotheses that account for different subsets of the
data.

3.  Concurrent  Classification 

3.1. Hierarchical Classification

The RED and the MDX2 systems use
hierarchical classification to accomplish a part of the
problem solving. In hierarchical classification
precompiled hypotheses are organized into a taxonomic
hierarchy [Gomez and Chandrasekaran, 1981].
Associated with each hypothesis is a specialist for that
hypothesis, a problem solving agent with embedded

then the hypothesis is said to be “established”, else it
is “rejected”. If the hypothesis is established, then
the specialist attempts to refine its hypothesis by
sending  messages  invoking  each  of  its  subspecialists, 
which  then  repeat  the  process.  If  a  hypothesis  is 
rejected  then  its  subhypotheses  are  implicitly  rejected 
as  well,  thus  pruning  the  search.  In  this  way, 
following the top-down prune-or-pursue control regime
that  has  been  called  “establish-refine”,  the  hypothesis 
space  is  efficiently  searched. 

Thus,  a  community  of  specialists  cooperate  to 
perform  the  task  of  hierarchical  classification.  Since 
knowledge  is  distributed  among  the  hierarchically 
organized specialists, and since the control of
problem-solving is top-down, the refinement of
established hypotheses can be done in parallel. An
algorithm  for  concurrent  classification  is  given  in 
Figure  1. 

Match hypothesis with relevant subset of data
Compute likelihood value for hypothesis
If likelihood value is high
then
    Establish hypothesis
    if the specialist is a leaf specialist
    then
        STOP
    else
        Invoke all subspecialists
        STOP
    end if
else
    Reject hypothesis
    STOP
end if

Figure 1: Concurrent Classification
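The establish-refine regime of Figure 1 can be sketched in Python as follows. This is only an illustrative sketch under stated assumptions, not the implementation used in RED or MDX2: the `Specialist` class, the `THRESHOLD` cut-off for a "high" likelihood, and the per-specialist `likelihood` function are all hypothetical. The sketch evaluates one level of the hierarchy at a time on a thread pool, which is somewhat more synchronized than Figure 1 requires, since Figure 1 lets each established specialist invoke its subspecialists as soon as it is done.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass, field
from typing import Callable, Dict, List

THRESHOLD = 0.5  # assumed cut-off for a "high" likelihood value


@dataclass
class Specialist:
    """One node of the taxonomic hierarchy (hypothetical representation)."""
    name: str
    likelihood: Callable[[Dict[str, float]], float]  # matches hypothesis to data
    children: List["Specialist"] = field(default_factory=list)


def classify(root: Specialist, data: Dict[str, float],
             max_workers: int = 4) -> List[str]:
    """Return the names of all established hypotheses, top-down."""
    established: List[str] = []
    frontier = [root]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        while frontier:
            # All specialists invoked at this level run concurrently.
            scores = list(pool.map(lambda s: s.likelihood(data), frontier))
            next_frontier: List[Specialist] = []
            for spec, score in zip(frontier, scores):
                if score >= THRESHOLD:
                    established.append(spec.name)          # establish
                    next_frontier.extend(spec.children)    # refine
                # else: reject; subhypotheses are pruned implicitly
            frontier = next_frontier
    return established
```

A rejected specialist never contributes its children to the next frontier, which is how the pruning of the search described above shows up in the sketch.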

3.2. Multiple Classification

The MDX2 system contains several classification
modules, each responsible for the classification task in
its respective subarea of medicine. Multiple hierarchy
classification is an instance of distributed
problem-solving at two levels of organization: at the
level of distinct classification modules, and at the level
of distinct classification specialists within each module.
We have just provided a concurrent algorithm for the
second level. Clearly it is possible to run the different
classification modules concurrently as well.

4.  Concurrent  Assembly 

4.1. A Basic Serial Hypothesis Assembler

Let D = {d_i}, i = 1,2,...,n be a set of data items,
and let H = {h_j}, j = 1,2,...,m be a set of hypotheses.
We assume that each h ∈ H is associated with
information specifying which specific elements of D it
can account for, and specifying the likelihood with
which it can account for them. For the basic
assembler we assume that the elements of H are
mutually compatible, represent explanatory alternatives
where their explanatory capabilities overlap, and
otherwise do not interact with each other.

The task of the assembler is to build a best
composite hypothesis C for explaining the elements of
D, using the members of H as candidate parts. Note
that there is no a priori guarantee that a unique best
explanation exists. The assembler builds the
composite hypothesis C using a specialized means-ends
machine whose goal is a complete explanation. The
assembler detects differences between the goal state
(all of D has been explained), and the present state
(some d_k has not been explained). It then selects an
h from H which can explain the unexplained d_k, and
integrates this h into the growing composite hypothesis
C. Since, as we have postulated, the h_j are
non-interacting, “integrating” h into C just amounts
to logically conjoining it with what is already there.

There are three things that can happen when
trying to explain some d_k:

1. There may be no h ∈ H that can account
for it. Then d_k is unexplainable.

2. There may be only one h ∈ H that can
account for it. Then this h is essential.

3. There may be more than one h ∈ H that
can account for d_k. Then the h accounting
for d_k which has the highest likelihood
value should be selected for integration into
C. If the likelihood values for two or more
h are the same, then selecting between
them is based on some measure of overall
explanatory power, or if that will not break
the tie, then selection is made at random.
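The three cases above can be sketched as a short serial routine. This is a minimal sketch under assumptions that are not in the paper: hypotheses are represented by a dictionary `explains` from hypothesis name to the set of data it accounts for, "overall explanatory power" is taken to be the number of data items a hypothesis explains, and the random tie-break uses a seeded generator so runs are repeatable.

```python
import random
from typing import Dict, List, Optional, Set, Tuple


def assemble_serial(data: List[str],
                    explains: Dict[str, Set[str]],
                    likelihood: Dict[str, float],
                    rng: Optional[random.Random] = None
                    ) -> Tuple[List[str], List[str]]:
    """Return (composite hypothesis C, unexplainable data)."""
    rng = rng or random.Random(0)
    composite: List[str] = []
    explained: Set[str] = set()
    unexplainable: List[str] = []
    for d in data:
        if d in explained:
            continue                      # no difference from the goal state
        candidates = [h for h in explains if d in explains[h]]
        if not candidates:                # case 1: nothing accounts for d
            unexplainable.append(d)
            continue
        # Cases 2 and 3: keep the most likely candidate(s).
        top = max(likelihood[h] for h in candidates)
        tied = [h for h in candidates if likelihood[h] == top]
        if len(tied) > 1:                 # tie: compare explanatory power
            power = max(len(explains[h]) for h in tied)
            tied = [h for h in tied if len(explains[h]) == power]
        choice = tied[0] if len(tied) == 1 else rng.choice(tied)
        composite.append(choice)          # conjoin h with what is in C
        explained |= explains[choice]
    return composite, unexplainable
```

Because the hypotheses are assumed not to interact, adding a hypothesis to the list is all that "integration" amounts to here.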

4.2. A Concurrent Hypothesis Assembler

4.2.1. An Architecture for Concurrent Assembly

There are two types of questions that are raised
during the basic assembly. The first type is from the
viewpoint of each d ∈ D, and is of the form “Which
hypothesis in H can best explain me?”. For
concurrent assembly this type of question can be asked
and answered for each element of D independently of
the others, and in parallel. The second type of question
is from the viewpoint of each h ∈ H, and is of the form
“Which elements of D should I be used to explain?”.
Again, this type of question can be asked and
answered for each h ∈ H independently of the others,
and in parallel. Let P be a set of n processors
corresponding to the d_i, and let Q be a set of m
processors corresponding to the h_j. Then
d_i, i = 1,2,...,n and h_j, j = 1,2,...,m may individually
reside on the n+m processors. We will also need an
additional processor R for collecting results. We
assume that each of the n+m+1 processors is
equipped  with  a  local  memory,  and  performs  its  task 
using  only  local  resources  unless  otherwise  noted. 

The communication between the processors, and
the control of problem solving, can be achieved by
using a blackboard architecture. In this particular
framework for concurrent assembly the blackboard is
used only as a shared data structure on a shared
memory. The blackboard may be divided into two
sides, a data side, and a hypothesis side. The
blackboard contains the state of the problem-solving at
any given time, initially containing d_i, i = 1,2,...,n on
the data side, and h_j, j = 1,2,...,m on the hypothesis
side. The hypothesis side also contains, for each
h ∈ H, a list of the specific elements of D for which
it can account, and the likelihood value with which it
can account for them. The initial information on the
blackboard may be posted by classifiers (concurrent or
not), or by some other source of plausible hypotheses.

Each of the n+m+1 processors has access to
both sides of the blackboard; each processor, when
idle, is “looking” at the blackboard. A processor gets
invoked when appropriate marks are placed on the
blackboard; each processor, when invoked, performs its
task and writes its results on the blackboard by
placing appropriate marks. Finally, the hypotheses
with appropriate marks are collected into C.

The semantics of the marks that may be placed
on the blackboard are as follows:

• The mark of h_K on some d_L ∈ D implies
that the datum d_L is explainable by
hypothesis h_K.

• The mark of Explained on some d_L ∈ D
implies that the datum d_L has been
explained.

• The mark of Unexplainable on some
d_L ∈ D implies that the datum d_L is
unexplainable using H.

• The mark of Essential on some h_K ∈ H
implies that the hypothesis h_K is in C and
is essential to C (i.e. is indispensable; no
complete explanation is possible without it).

• The mark of Best on some h_K ∈ H implies
that the hypothesis h_K is in C and
represents a most likely explanation for
some data item.

• The mark of In on some h_K ∈ H implies
that the hypothesis h_K is in C.

4.2.2. An Algorithm for Concurrent Assembly

We now present an algorithm for
concurrent assembly.


1. Mark what each hypothesis can explain:

   Q:
      for each h ∈ H,
         for each d ∈ D,
            if h explains d,
               then mark d with the name of h.

2. Find unexplainable data, and essential hypotheses:

   P:
      for each d ∈ D,
         if d is not marked with any h,
            then mark d as Unexplainable,
         else if d is marked with only one h ∈ H,
            then mark h as Essential.

3. Find data items explained by essential hypotheses:

   Q, using the subset of processors corresponding
   to h ∈ H marked Essential:
      for each h marked Essential,
         for each d ∈ D such that h explains d,
            mark d as Explained.

4. Select additional best explanation hypotheses
   to cover more of the data:

   P, using the subset of processors corresponding
   to d ∈ D not marked Explained or Unexplainable:
      for each d not marked Explained or Unexplainable,
         if there is a most likely h, say h_K,
         among the hypotheses marked in step 1
         as explaining d,
            then mark h_K as Best.

5. Find data items explained by best hypotheses:

   Q, using the subset of processors corresponding
   to h ∈ H marked Best:
      for each h marked Best,
         for each d ∈ D such that h explains d,
            mark d as Explained.

6. Select additional hypotheses by guessing
   to cover the rest of the data:

   P, using the subset of processors corresponding
   to d ∈ D not marked Explained or Unexplainable:
      for each d not marked Explained or Unexplainable,
         choose an h, say h_K,
         among the hypotheses marked in step 1
         as explaining d, and
            mark h_K as In.

7. Collect results forming the composite hypothesis C:

   R:
      Collect h ∈ H marked Essential
      or Best or In into C.
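The seven steps can be sketched in Python as a series of waves over a thread pool. This is a sketch under several assumptions, not the authors' implementation: the blackboard is modeled as three plain dictionaries, the P and Q processors are simulated by pool tasks that only read the blackboard, and the main thread writes the resulting marks between waves. The names `covers`, `sole`, `likeliest`, and `guess` are hypothetical, and the "guess" of step 6 is made deterministic by taking the first of the most likely candidates in name order.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List, Set


def assemble_concurrent(data: List[str], explains: Dict[str, Set[str]],
                        likelihood: Dict[str, float], workers: int = 4):
    hyps = sorted(explains)
    d_marks: Dict[str, List[str]] = {d: [] for d in data}  # names of h on each d
    d_state: Dict[str, str] = {}   # d -> "Explained" | "Unexplainable"
    h_state: Dict[str, str] = {}   # h -> "Essential" | "Best" | "In"

    def covers(h):      # Q task: which data does h explain?
        return [d for d in data if d in explains[h]]

    def sole(d):        # P task: the only explainer of d, if there is one
        return d_marks[d][0] if len(d_marks[d]) == 1 else None

    def likeliest(d):   # P task: the unique most likely explainer, if any
        top = max(likelihood[h] for h in d_marks[d])
        tied = [h for h in d_marks[d] if likelihood[h] == top]
        return tied[0] if len(tied) == 1 else None

    def guess(d):       # P task: pick one of the most likely explainers
        return max(d_marks[d], key=lambda h: likelihood[h])

    with ThreadPoolExecutor(max_workers=workers) as pool:
        def select(task, mark):      # a wave over the unresolved data (P)
            open_d = [d for d in data if d not in d_state]
            for h in list(pool.map(task, open_d)):
                if h is not None:
                    h_state.setdefault(h, mark)

        def spread(mark):            # a wave over hypotheses with `mark` (Q)
            marked = [h for h in hyps if h_state.get(h) == mark]
            for ds in list(pool.map(covers, marked)):
                for d in ds:
                    d_state[d] = "Explained"

        for h, ds in zip(hyps, list(pool.map(covers, hyps))):   # step 1
            for d in ds:
                d_marks[d].append(h)
        for d in data:                                          # step 2
            if not d_marks[d]:
                d_state[d] = "Unexplainable"
        select(sole, "Essential")
        spread("Essential")                                     # step 3
        select(likeliest, "Best")                               # step 4
        spread("Best")                                          # step 5
        select(guess, "In")                                     # step 6
    composite = [h for h in hyps if h in h_state]               # step 7 (R)
    return composite, h_state, d_state
```

Keeping the per-processor tasks read-only and writing marks only between waves avoids any need for locking in this sketch; a real shared-memory blackboard would need its own discipline for concurrent writes.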

4.3. Criticism of Composite Hypothesis

Once the composite hypothesis C has been
assembled, it may be tested for parsimony, and
possibly improved; further, component hypotheses
may be tested for essentialness (some not previously
counted as essential may yet gain that status), and C
may be improved as a result of this too.

A composite hypothesis is parsimonious if it has
no explanatorily superfluous parts. An hypothesis in
C is explanatorily superfluous if removing it from C
does not reduce the explanatory capability of C.
After testing for parsimony, the explanatorily
superfluous hypotheses are removed from C. Where it
is not possible to simultaneously remove all of the
explanatorily superfluous hypotheses without leaving
some data unexplained, an ordering can be made
by appealing to their firmness in C; that is, Best
hypotheses should be retained in favor of ones which
are merely In. (Essential hypotheses cannot turn out
to be superfluous.) Similarly, where firmness in C
does not resolve the choice, retention can be based on
the initial determination of likelihood for each
hypothesis as it comes from the hypothesis source.

Algorithmically, one way to accomplish this
parsimony testing and improvement is to first test all
non-essential hypotheses (in parallel) for
superfluousness. If all of the superfluous hypotheses
thus found can be simultaneously removed from C
without leaving any datum unexplained, then well and
good, they are all removed, leaving a parsimonious
hypothesis. If they cannot all be removed, then just
the In hypotheses that are superfluous are removed en
masse, if that will not leave anything unexplained. If
the superfluous In hypotheses cannot all be removed
without leaving something unexplained, then they are
removed one by one, ordered by initial likelihood, and
starting with the least likely, retaining only those
which, at the point that they are considered, cannot
be removed without destroying explanatory coverage.
Once the process of removing superfluous In
hypotheses is complete, the Best hypotheses are again
tested in parallel for superfluousness (superfluous ones
may have become non-superfluous in the context of a
C improved by removing superfluous In's). Once
again superfluous Best's are removed, either en masse,
if that will not leave anything unexplained, or one by
one using initial likelihood as the ordering principle.
The result of this process will be to improve C to the
point that it is completely parsimonious.
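The parsimony criticism just described can be sketched as follows. This is an illustrative sketch only: the function and parameter names are hypothetical, the superfluousness tests are run serially rather than in parallel, and ties in initial likelihood are broken by hypothesis name so that the result is repeatable.

```python
from typing import Dict, Iterable, Set


def coverage(hyps: Iterable[str], explains: Dict[str, Set[str]]) -> Set[str]:
    """All data explained by at least one of the given hypotheses."""
    out: Set[str] = set()
    for h in hyps:
        out |= explains[h]
    return out


def criticize_for_parsimony(composite: Iterable[str], marks: Dict[str, str],
                            explains: Dict[str, Set[str]],
                            likelihood: Dict[str, float]) -> Set[str]:
    """Return C with explanatorily superfluous hypotheses removed."""
    comp = set(composite)
    target = coverage(comp, explains)       # what C explains before pruning

    def is_superfluous(h: str) -> bool:
        return coverage(comp - {h}, explains) >= target

    def remove(group) -> None:
        nonlocal comp
        group = set(group)
        if coverage(comp - group, explains) >= target:
            comp -= group                    # en masse removal is safe
            return
        # Otherwise one by one, least likely first, re-testing each time.
        for h in sorted(group, key=lambda h: (likelihood[h], h)):
            if is_superfluous(h):
                comp.discard(h)

    candidates = {h for h in comp
                  if marks[h] != "Essential" and is_superfluous(h)}
    if coverage(comp - candidates, explains) >= target:
        comp -= candidates                   # all can go simultaneously
    else:
        remove(h for h in candidates if marks[h] == "In")
        # Best hypotheses are re-tested in the context of the improved C.
        remove([h for h in comp
                if marks[h] == "Best" and is_superfluous(h)])
    return comp
```

Re-testing each hypothesis at the point it is considered is what keeps explanatory coverage intact when several individually superfluous hypotheses depend on one another.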

If  there  are  incompatibilities  between  hypotheses 
(see  below),  a  data  item  may  initially  (in  step  2  of 
the  assembly  algorithm  described  above)  appear  to 
have  several  potentially  explaining  hypotheses,  yet  only 
one  of  these  explanations  is  really  viable.  This  can 
occur  for  example  if  all  but  one  of  the  potential 
explainers  is  inconsistent  with  some  other  hypothesis 
which  is  Essential.  Thus  hypotheses  in  C  not 
marked  Essential  are  tested  to  see  if  indeed  complete 
and  consistent  composite  explanatory  hypotheses  can 
be  built  without  using  them.  If  not,  then  such  ones 
also  deserve  to  be  counted  as  Essential,  and  allowing 
for the possibility that leveraging early islands of
certainty  (the  essentials)  might  make  a  difference  for 
the  final  composition  of  C,  the  algorithm  described 
above  should  be  restarted  after  step  2  with  the  newly 
discovered  essentials  now  marked.  (Criticism  for 
parsimony  will  need  to  be  repeated,  but  criticism  for 
essentialness  will  not.  Essentialness  of  an  hypothesis 
is  a  global  property  of  the  setup  and  is  independent 
of the composition of C; thus it will not change on a
subsequent  assembly  of  C.) 

4.4. Interacting Hypotheses

Several distinct types of interaction are possible
between two hypotheses h_r, h_s ∈ H.

1. h_r and h_s are mutually compatible, and
represent explanatory alternatives where
their explanatory capabilities overlap. This
was our assumption for the basic assembler.

2. h_s is a subhypothesis of h_r. This can
happen if the source of hypotheses is a
hierarchical classifier as in RED and MDX2.

3. The inclusion of h_r in C suggests the
inclusion of h_s. Such an interaction may
arise if the assembler has knowledge of a
statistical association between h_r and h_s.

4. h_r and h_s cooperate additively where their
explanatory capabilities overlap. This
happens in RED's domain.

5. h_r and h_s are mutually incompatible.

6. h_r and h_s cancel the explanatory
capabilities of each other in relation to
some d ∈ D. For example h_r (being true)
might imply that some data value will
increase, while h_s implies that the value
will decrease.

For each of the above interactions we have
developed  marker  algorithms  using  the  blackboard 
architecture.  However,  the  algorithms  that  we  have 
developed for incompatibility and cancellation
interactions work only for disjoint, pair-wise
interactions. The problem with more involved forms
of interaction arises because knowledge in our
architecture is distributed among the n+m+1
processors, and each processor performs its task
locally, while these interactions in general require
global computation. One way to accommodate the
incompatibility and cancellation interactions between
hypotheses may be to augment the blackboard
architecture with the use of critics that have a global
view of the state of problem-solving as it appears on
the blackboard.

4.5.  Hierarchical  Assembly 

In many domains it is possible to form groups of
strongly interrelated data in the data set, for instance
the groups of data corresponding to different
classification modules in MDX2. In such domains it
is possible to build and use a hierarchy of several
small assemblers, rather than one large, flat assembler.
Let us consider a two-level hierarchy of assemblers.
At the lower level in the hierarchy, assemblers
corresponding to different data groupings form
composite hypotheses to account for the data types for
which they are specialized, and the top assembler
forms a composite from the sub-composites. Notice
that it is possible for a datum to appear in more than
one data grouping, and similarly, it is possible for a
component hypothesis to appear in more than one
composite hypothesis. Weak interactions at the lower
level are reconciled at the highest level of the
hierarchy, where all of the data needs to be explained.

This scheme is generalizable to a hierarchy of
assemblers with a finite number of levels. Problem
decomposition activations flow downwards through the
hierarchy and composite hypotheses flow upwards. An
approach similar to this is being explored in the
construction of PEIRCE. The idea is to use a
hierarchical organization of knowledge to perform a
computationally complex task efficiently. Problem
solving knowledge is distributed over the assemblers in
the hierarchy, who cooperatively perform the overall
assembly task. Notice that the approach to
concurrent assembly described earlier complements this
hierarchical assembly. Each assembler can perform its
task using the blackboard architecture and marker
algorithm that we have provided. Further, it is
possible to exploit parallelism between the sibling
assemblers at the same level in the hierarchy.
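A two-level hierarchy of assemblers might be sketched as follows. This is a deliberately simplified sketch under assumptions that go beyond the text: the lower-level assembler is a greedy stand-in (`assemble`) rather than the marker algorithm, every lower-level assembler sees the same hypothesis set but only its own data group, and the top assembler treats the hypotheses proposed in the sub-composites as its candidate parts. All names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List, Set


def assemble(data: List[str], explains: Dict[str, Set[str]],
             likelihood: Dict[str, float]) -> List[str]:
    """Greedy stand-in for one assembler: cover each unexplained datum
    with its most likely explainer (first in name order on ties)."""
    chosen: List[str] = []
    explained: Set[str] = set()
    for d in data:
        if d in explained:
            continue
        candidates = [h for h in sorted(explains) if d in explains[h]]
        if candidates:
            best = max(candidates, key=lambda h: likelihood[h])
            chosen.append(best)
            explained |= explains[best]
    return chosen


def assemble_hierarchically(groups: Dict[str, List[str]],
                            explains: Dict[str, Set[str]],
                            likelihood: Dict[str, float]) -> List[str]:
    """groups maps a lower-level assembler's name to its data grouping."""
    # Sibling assemblers at the lower level run in parallel.
    with ThreadPoolExecutor() as pool:
        sub_composites = list(pool.map(
            lambda ds: assemble(ds, explains, likelihood), groups.values()))
    # The top assembler must explain all of the data; a datum may appear
    # in more than one grouping, so duplicates are dropped in order.
    all_data = list(dict.fromkeys(d for ds in groups.values() for d in ds))
    proposed = {h for sub in sub_composites for h in sub}
    top_candidates = {h: explains[h] for h in proposed}
    return assemble(all_data, top_candidates, likelihood)
```

Restricting the top assembler to hypotheses that survived a lower-level assembly is one way composite hypotheses can "flow upwards"; it also shrinks the candidate set the top level has to consider.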

5.  Conclusions 

We have found concurrency in abductive
reasoning in different forms and at different levels as
summarized below:

• The classification task may be viewed as an
instance  of  distributed  problem  solving. 
Classification  may  be  performed  by  a 
community  of  specialists  organized  in  a 
taxonomic  hierarchy.  We  have  provided  a 
concurrent  algorithm  for  the  classification 
task. 

•  At  a  higher  level  of  organization, 
classificatory  problem  solving  may  be 
distributed  among  multiple  classification 
modules.  It  is  possible  to  exploit 
concurrency at this level as well.

•  The  assembly  task  may  be  performed 
concurrently.  Data  items  and  component 
hypotheses  can  reside  individually  on 
separate  processors.  The  data  processors, 
and  separately  the  hypotheses  processors, 
can  execute  in  parallel,  accomplishing  the 
assembly  task  in  several  distinct  waves  of 
processing  as  described  above. 

•  It  is  possible  to  perform  the  task  of 
assembly  by  a  hierarchical  organization  of 
assemblers,  where  the  processing  at  each 
level  goes  on  in  parallel. 

There  are  several  promising  lines  of  research 
from  here: 

• The blackboard architecture can be
augmented using critics with a global view
of the state of problem solving in order to
accommodate incompatibility and
cancellation interactions between hypotheses.

• A complexity analysis of the concurrent
algorithms can be conducted.

• The concurrent algorithms can be
simulated, and their efficiencies tested
empirically.

• Constraints and opportunities imposed by
available multiprocessor architectures can be
investigated.

• A hierarchical assembler can be built.

Abstract


This  report  documents  an  experiment  investigating  the 
potential  of  a  parallel  computing  architecture  to  enhance  the 
performance  of  a  knowledge-based  signal  understanding 
system.  The  experiment  consisted  of  implementing  and 
evaluating  an  application  encoded  in  a  parallel  programming 
extension  of  Lisp  and  executing  on  a  simulated  multiprocessor 
system. 

The chosen application for the experiment was a
knowledge-based system for interpreting pre-processed,
passively acquired radar emissions from aircraft. The
application was implemented in an experimental concurrent,
asynchronous object-oriented framework. This framework, in
turn,  relied  on  the  services  provided  by  the  underlying 
hardware  system.  The  hardware  system  for  the  experiment 
was  a  simulation  of  various  sized  grids  of  processors  with 
inter-processor  communication  via  message-passing. 

The  experiment  investigated  the  effects  of  various  high-level 
control  strategies  on  the  quality  of  the  problem  solution,  the 
speedup  of  the  overall  system  performance  as  a  function  of 
the  number  of  processors  in  the  grid,  and  some  of  the  issues 
in  implementing  and  debugging  a  knowledge-based  system  on 
a  message-passing  multiprocessor  system. 

In  this  report  we  describe  the  software  and  (simulated) 
hardware  components  of  the  experiment  and  present  the 
qualitative  and  quantitative  experimental  results. 


1.  Introduction 

This  report  documents  an  experiment  investigating  the 
potential  of  a  parallel  computing  architecture  to  enhance  the 
performance  of  a  knowledge-based  signal  understanding 
system.  This  experiment  was  done  within  the  Expert  Systems 
on  Multiprocessor  Architectures  Project  of  Stanford 
University's  Knowledge  Systems  Laboratory. 

The  computational  characteristics  of  complex  knowledge-based 
systems  are  poorly  understood,  especially  in  parallel
computational  environments.  Our  Architectures  Project  is 
performing  a  number  of  experiments  to  try  to  gain  some 
understanding  of  these  characteristics  and,  in  particular,  of  the 
potential  for  concurrent  execution  of  such  systems.  A 
primary  goal  of  the  project  is  to  develop  software  and
hardware  system  architectures  which  exploit  this  concurrency 
to  increase  the  performance  of  knowledge-based  signal 
understanding  and  information  fusion  systems. 

The  Architectures  Project  is  organized  according  to  a 
hierarchy  of  computational  abstraction  levels  as  shown  in 
Table  1-1.  Each  experiment  represents  a  narrow,  vertical  slice 
through  these  levels  and  consists  of  a  specific  system  choice 
for  each  level. 


Table 1-1: Computational levels.

Level                   Research questions
----------------------  ----------------------------------------------------
Application             Where is the potential concurrency in
                        knowledge-based signal understanding tasks?

                        How does the problem solver recognize and express
                        application dependent concurrency?

Problem-solving         What are suitable framework constructs for
framework               organizing and encoding concurrent signal
                        understanding tasks?

                        What are appropriate granularities for knowledge,
                        knowledge application and data to maximize
                        concurrency?

                        What types of strategies for control of knowledge
                        application are needed to assure acceptable
                        solution quality without introducing excessive
                        execution serialization?

Knowledge               What kinds of knowledge representation mechanisms
representation          are suitable for exploiting concurrency in
and management          inference and search?

System programming      How can general-purpose symbolic programming
language                languages be extended to support concurrency and
                        help manage the resource allocation and
                        reclamation tasks on a distributed memory
                        multiprocessor?

Hardware system         What multiprocessor architectures best support the
architecture            organization and concurrency in knowledge-based
                        signal understanding applications?

For  the  reported  experiment,  the  chosen  application  is  a
knowledge-based  ELINT  (ELectronics  INTelligence)  system
for  interpreting  processed,  passively  acquired  radar  emissions
from  aircraft.  The  ELINT  application  is  implemented  in
CAOS,  an  experimental  concurrent,  asynchronous
object-oriented  framework  built  on  Zetalisp  [1].  The  CAOS
framework,  in  turn,  relies  on  the  services  provided  by  the
underlying  hardware  system  environment.  For  this 
experiment,  the  hardware  system  environment  is  a  simulation 
of  a  parallel  architecture,  called  CARE  [2].  CARE  simulates 
a  communications  grid  of  processing  sites  where  each  site
contains  a  Lisp  evaluator,  private  memory,  and  a
communications  and  process  scheduling  subsystem. 
Message-passing  is  the  only  means  of  inter-site 
communication.  CARE  is  simulated  using  a  general, 
event-based  simulator,  SIMPLE  [3].  SIMPLE  is  written  in
Zetalisp  and  executes  on  a  Symbolics  3600  or  a  Texas 
Instruments  Explorer  Lisp  machine.1  Figure  1-1  illustrates  the 
relationship  between  the  various  software  components  of  the 
experiment. 


ELINT       Interpretation of radar emissions from aircraft
CAOS        Concurrent, asynchronous object system
Zetalisp+   Zetalisp plus locality and communication constructs
CARE        Grid-based, message-passing multiprocessor specification
SIMPLE      Hardware specification system and event-driven simulator
Zetalisp

Figure 1-1: The software component hierarchy of the experiment.

The  ELINT-CAOS-CARE  experiment  investigated  both
qualitative  and  quantitative  aspects  of  the  performance  of  the 
overall  system.  The  CARE  architecture  uses  dynamic, 
cut-through  (as  opposed  to  store  and  forward)  routing  through 
the  communication  grid  for  interprocessor  message 
transmission.  Message  transmission  time  is  indeterminate.  As 
a  consequence,  without  the  imposition  of  significant  message 
sequencing  protocols  (and  the  corresponding  serialization  of 
execution),  operations  are  intrinsically  non-deterministic  in 
the  sense  that  two  executions  of  the  same  program  on  the 
same  input  data  can  result  in  different  problem  solutions 
depending  on  different  message  arrival  orders.  For  many 
knowledge-based  systems,  in  particular,  the  ELINT  system,
there  is  no  such  thing  as  the  correct  problem  solution  but
only  satisficing  (i.e.,  acceptable)  problem  solutions.  One
primary  objective  of  the  experiment  was  to  investigate  the 
trade-offs  between  the  imposition  of  various  synchronizations 
(and  the  resulting  loss  of  concurrency)  and  the  quality  of  the 
problem  solution.  A  second  primary  objective  was  the  more 
usual  investigation  of  the  speedup  of  the  overall  system 
performance  as  a  function  of  the  number  of  processing  sites 
in  the  CARE  grid.  A  third  objective  was  to  gain  some
understanding  of  the  difficulties  in  implementing  and 
debugging  a  reasonably  complex  knowledge-based  system  on  a 
multiple  address  space,  message-passing  multiprocessor  system 
such  as  that  represented  by  CARE. 

In  the  following  sections  we  describe,  in  decreasing 
hierarchical  order,  each  component  of  the  experiment. 


1A  version  of  the  SIMPLE  simulator  which  runs  on  a  local  area  network  of
multiple  Lisp  machines  has  also  been  implemented  [4].


Section  2  describes  the  ELINT  application.  Section  3  gives  an 
overview  of  the  CAOS  programming  framework  and  its  approach
to  concurrency.  ELINT’s  implementation  in  CAOS  is 
described  in  Section  4,  and  Section  5  describes  the  salient 
features  of  the  CARE  architecture  and  its  simulation 
environment.  In  Section  6  we  present  the  results  of  the 
ELINT-CAOS-CARE  experiment. 


2.  The  ELINT  Application 

The  driving  application  for  our  vertical  slice  experiment  is  a 
prototype,  knowledge-based  ELINT  system  for  interpreting 
processed,  passively  acquired,  real-time  radar  emissions  from 
aircraft.  This  ELINT  system  is  one  component  of  a 
multi-sensor  information  fusion  system,  TRICERO  [5],
developed  several  years  ago.  ELINT  was  originally
implemented  in  AGE  [6],  an  expert  system  development  tool
based  on  the  blackboard  paradigm  [7,  8].  ELINT  is  a
relatively  simple,  but  non-trivial,  knowledge-based  system. 
Much  of  its  knowledge  is  implemented  procedurally. 
However,  if  ELINT  had  been  implemented  as  a  production 
rule  system,  we  estimate  that  its  knowledge  base  would  consist 
of  about  one  thousand  rules.2 

ELINT’s  basic  analysis  technique  is  to  correlate  a  large 
number  of  passively  observed  radar  emissions  into  the  smaller 
number  of  individual  radar  emitters  producing  those 
emissions.  It  then  correlates  the  emitters  into  the  yet  smaller 
number  of  clusters  of  co-located  emitters.  ELINT  maintains 
the  track  and  activity  histories  of  the  clusters.


2.1.  ELINT’s  Inputs

The  inputs  to  the  ELINT  system  are  multiple,  time-ordered 
streams  of  processed  observations  from  multiple  collection 
sites.  Each  observation  is  presented  in  a  record  format.  The 
fields  of  an  input  observation  record  are  shown  in  Table  2-1. 


Table 2-1: ELINT observation record.

Field               Contents
------------------  ------------------------------------------------
Observation-Time    An integer time-tag indicating when the radar
                    emission was sampled

Observation-Site    The symbolic name of the collection site
                    acquiring the observation

Site-Location       The positional coordinates of the collection
                    site at the time of observation

Emitter-Identifier  An integer identifying the radar emitter
                    producing the emission

Line-of-Bearing     The line of bearing from the collection site to
                    the observed emitter

Emitter-Type        A symbolic radar emitter type designator

Emitter-Mode        The operational mode of the emitter at the time
                    of observation

Signal-Quality      A symbolic indicator of the signal quality of
                    the observed emission

2In  general,  there  are  currently  no  adequate  metrics  for  measuring  the
complexity  of  knowledge-based  systems.  One  crude  measure  used  for
rule-based  systems  is  the  number  of  rules.  Although  the  number  of  rules
does  somewhat  indicate  the  amount  of  knowledge,  it  does  not  give  much
indication  of  the  complexity  of  the  reasoning.


The  Site-Location  field  is  necessary  since  the  collection  sites
can  be  mobile.  The  Emitter-Identifier  is  a  unique  integer
identifier  assigned  by  the  collection  sites  to  each  distinct
observed  emitter.  This  identifier  is  used  by  the  collection
sites  to  indicate  multiple  observations  of  the  same  emitter
both  over  time  and  from  different  collection  sites.  In
particular,  two  concurrent  observations  of  the  same  emitter 
from  different  collection  sites  should  have  the  same  identifier. 
Both  the  intra-site  and  inter-site  determination  of  whether 
two  observed  emissions  are  from  the  same  emitter  are  based 
on  the  electronic  characteristics  of  the  emissions  and  on 
signature  analysis.  This  determination  may  be  in  error,  and 
the  ELINT  system  must  cope  with  such  identifier  errors.  The 
Emitter-Type  of  a  radar  emitter  indicates  the  functional  class 
of  the  emitter,  for  example,  Air-Intercept  (AI),  Navigation
(NAV)  or  Identification-Friend-Or-Foe  (IFF),  and,  if  known,
the  equipment  type  class  of  the  emitter.  Certain  classes  of 
emitter  types  can  have  multiple  operational  modes.  The 
Emitter-Mode,  if  applicable,  is  emitter-type  specific.  For 
example,  an  AI  radar  can  be  either  in  Search  Mode  or 
Lock-on  Mode  depending  on  whether  it  is  scanning  for  a 
target  or  whether  it  is  automatically  tracking  a  specific  target. 
The  Signal-Quality  of  an  observation  is  a  subjective, 
qualitative  measure  of  the  strength  of  the  observed  emission, 
for  example,  strong,  normal,  or  fading. 
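
As an illustration only, the observation record of Table 2-1 could be modelled as a small immutable data type. The original system was written in Lisp; the Python names below, the planar coordinate pair, and the use of degrees for the line of bearing are assumptions made for this sketch, not details from the ELINT implementation.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Observation:
    """One processed ELINT observation, mirroring the fields of Table 2-1."""
    observation_time: int               # integer time-tag of the sample
    observation_site: str               # symbolic name of the collection site
    site_location: Tuple[float, float]  # assumed planar (x, y); sites can be mobile
    emitter_identifier: int             # assigned by the collection sites
    line_of_bearing: float              # assumed here to be in degrees
    emitter_type: str                   # e.g. "AI", "NAV", "IFF"
    emitter_mode: Optional[str]         # e.g. "Search", "Lock-on"; None if not applicable
    signal_quality: str                 # e.g. "strong", "normal", "fading"


def same_emitter_claimed(a: Observation, b: Observation) -> bool:
    """True if the collection sites labelled both observations as one emitter.

    The label may be wrong, so a consumer would still need to verify it.
    """
    return a.emitter_identifier == b.emitter_identifier
```

Making the record immutable reflects that an observation is an input fact; any correction of an identifier error would happen on the situation board rather than in the record.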

All  of  the  input  information  required  for  the  ELINT  system 
is  obtainable  from  the  raw  radar  signal  data  using  current, 
passive  radar  signal  collection  and  processing  techniques. 
These  techniques  are  largely  automated  and  employ 
special-purpose  hardware. 


2.2.  ELINT’s  Outputs

The  primary  outputs  of  the  ELINT  system  are  periodic  status 
reports  about  the  tracks  and  activities  of  clusters  of  emitters 
in  the  area  under  surveillance.  A  cluster  is  defined  as  a
collection  of  emitters  which  are  co-located  over  time.  That
is,  two  emitters  are  in  the  same  cluster  if  for  some  given
minimum  number  of  consecutive  time  units  (three  in  the
current  ELINT  system)  their  corresponding  time-tagged
locational  fixes  are  within  a  distance  determined  by  the
line-of-bearing  resolution  of  the  observation  site  equipment
(one  degree  resolution  in  the  current  ELINT  system).
Conceptually,  two  emitters  are  in  the  same  cluster  if  they
are  on  the  same  aircraft  or  are  on  two  tactically  associated 
and  co-located  (over  time)  aircraft,  for  example,  a  lead 
aircraft  and  his  wingman.3 
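
The co-location criterion above can be sketched as follows. This is a minimal illustration under stated assumptions: fixes are planar (x, y) points keyed by integer time-tag, and the distance threshold is passed in as a parameter (`max_distance`, a hypothetical name) rather than derived from the line-of-bearing resolution as in the actual system.

```python
import math
from typing import Dict, Tuple

Fix = Tuple[float, float]  # assumed planar (x, y) coordinates


def co_located(fixes_a: Dict[int, Fix], fixes_b: Dict[int, Fix],
               max_distance: float, min_consecutive: int = 3) -> bool:
    """Return True if the two emitters have fixes within max_distance of each
    other for at least min_consecutive consecutive time units."""
    run = 0
    previous_time = None
    # Only time-tags for which both emitters have a fix can be compared.
    for t in sorted(set(fixes_a) & set(fixes_b)):
        close = math.dist(fixes_a[t], fixes_b[t]) <= max_distance
        consecutive = previous_time is not None and t == previous_time + 1
        if close:
            # Extend the current run only if this time unit directly follows it.
            run = run + 1 if consecutive and run > 0 else 1
            if run >= min_consecutive:
                return True
        else:
            run = 0
        previous_time = t
    return False
```

The default of three consecutive time units follows the value quoted for the current ELINT system.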

The  periodic  output  reports  contain,  for  each  cluster, 
information  about  the  cluster's  current  heading,  position  and 
track;  an  estimate  of  the  number  and  types  of  aircraft  in  the 
cluster;4  an  indication  of  the  cluster's  current  activity;  and  an 
indication  if  the  cluster  represents  an  immediate  threat,  for 
example,  if  it  is  within  a  certain  proximity  of  a  friendly 
aircraft,  if  its  AI  radar  is  in  Lock-on  Mode,  or  if  its  missile
guidance  radar  is  on. 


2.3.  ELINT’s  Processing  Flow

The  basic  reasoning  strategy  used  by  the  ELINT  application  is
data-driven  accumulation  of  evidence  for  the  existence,  the
tracks,  and  the  activities  of  emitters  and  clusters  based  on
input  observations  and  inferred  information.  The  primary
processing  flow  is  a  kind  of  pipeline  where  the  pipeline  stages
are  observations,  emitters  and  clusters.

3An  aircraft  can  be  operating  with  some  (or  all)  of  its  radars  off.  In
general,  it  is  impossible  to  distinguish  between,  for  example,  two  co-located
aircraft,  one  with  an  AI  radar  on  and  one  with  a  NAV  radar  on,  and  one
aircraft  with  both  its  AI  and  NAV  radars  on.  Hence,  our  ELINT  system  does
its  assessments  based  on  emitter  clusters  rather  than  aircraft.


Upon  receipt  of  a  new  observation,  the  system  first
determines  if  the  observed  emission  matches  (i.e.,  has  as  a
source)  a  known  emitter  (i.e.,  an  emitter  on  ELINT’s
"situation  board").  This  match  is  based  on  the
Emitter-Identifier  assigned  by  the  collection  site  to  the
observation,  and  it  is  verified  using  the  emitter’s
characteristics  and  its  track  and  heading  histories.  Depending
on  the  outcome  of  the  match,  one  of  the  following  actions  is
taken:

1.  If  the  observation  does  not  match  a  known 
emitter,  then  a  new  emitter  which  is  the  source  of 
the  observed  emission  is  hypothesized  on  the 
situation  board  and  initialized  from  the 
information  contained  in  the  observation. 

2.  If  the  observation  does  match  an  emitter  on  the 
situation  board  and  the  match  is  verified,  then  the 
information  contained  in  the  observation  is  used 
to  update  the  attributes  of  the  matched  emitter, 
including  increasing  the  confidence  level  of  the 
hypothesis  that  the  emitter  represents.  Moreover, 
if  the  new  observation  is  the  second  (or  greater) 
observation  of  the  emitter  for  the  current  time  and 
it  is  from  a  different  collection  site  than  the 
previous  observation(s)  at  that  time,  then  a 
locational  fix  for  the  emitter  is  computed  using 
the  observed  lines  of  bearing.  If,  in  addition,  the 
Emitter-Type  and/or  Emitter-Mode  indicate  a 
near-term  threat  to  a  friendly  aircraft,  then  a 
threat  report  is  output. 

3.  If  the  observation  matches  a  known  emitter  but 
fails  the  match  verification  test,  then  an  error  in 
the  Emitter-Identifier  is  indicated  and  the  situation 
board  is  modified  so  as  to  undo  any  incorrect 
inferences  based  on  the  error.  Also,  an  identifier 
error  report  is  output  to  the  collection  sites. 
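
The three cases above amount to a small dispatch on the outcome of the match. The sketch below is illustrative only: the situation board is modelled as a plain dictionary keyed by Emitter-Identifier, the verification test is supplied by the caller (`verify` is a hypothetical stand-in for the check against characteristics and histories), and the unit confidence increment is an assumption. Locational fixes, threat reports, and the undoing of incorrect inferences are not modelled.

```python
from typing import Callable, Dict


def process_observation(board: Dict[int, dict], obs: dict,
                        verify: Callable[[dict, dict], bool]) -> str:
    """Apply one observation to the situation board and name the action taken."""
    emitter = board.get(obs["emitter_id"])
    if emitter is None:
        # Case 1: no known emitter matches, so hypothesize a new one.
        board[obs["emitter_id"]] = {"type": obs["type"], "confidence": 1,
                                    "last_seen": obs["time"]}
        return "new-emitter"
    if verify(emitter, obs):
        # Case 2: verified match; update attributes and raise confidence.
        emitter["confidence"] += 1
        emitter["last_seen"] = obs["time"]
        return "emitter-updated"
    # Case 3: the identifier matched but verification failed. Undoing the
    # inferences drawn from the bad identifier is not modelled here.
    return "identifier-error"
```

A caller would route the "identifier-error" outcome to whatever component repairs the board and reports back to the collection sites.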

On  a  periodic  basis,  the  status  of  each  emitter  on  the 
situation  board  is  evaluated  and  various  actions  are  taken: 

1.  If  there  have  been  no  recent  observations  of  the 
emitter,  then  the  confidence  level  of  the  emitter  is 
reduced.  If,  as  a  consequence  of  this  reduction, 
that  level  falls  below  a  given  no-confidence
threshold,  then  the  emitter  and  all  of  the
consequences  inferred  from  it  (including  cluster
association)  are  deleted  from  the  situation  board. 

2.  If  the  confidence  level  is  above  a  given 
full-confidence  threshold  and  the  emitter  is  not 
currently  associated  with  a  known  cluster,  then  an 
attempt  is  made  to  match  the  emitter  with  a 
cluster  on  the  situation  board.  This  match  is 
based  on  the  track  and  heading  histories  and  the 
type  attributes  of  the  emitter  and  the  cluster.  If  a 
match  is  made,  then  the  emitter  is  associated  with 
the  matched  cluster  and  the  emitter’s  current 
attributes  are  used  to  update  the  attributes  of  the 
cluster.  If  the  match  fails,  then  a  new  cluster  is 
hypothesized  on  the  situation  board  and  the 
emitter  is  associated  with  it. 

3.  In  the  remaining  case  of  a  recently  observed 
emitter  with  an  associated  cluster,  the  current 
attributes  of  the  emitter  are  used  to  update  the 
attributes  of  its  associated  cluster. 
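
One possible reading of the periodic emitter evaluation is sketched below. The numeric thresholds, the decay step, and the length of the "recent" window are all assumed values chosen for illustration; the source gives none of them. The function only names the action that would be taken and leaves cluster matching to the caller.

```python
def evaluate_emitter(emitter: dict, now: int, recent_window: int = 3,
                     decay: int = 1, no_confidence: int = 0,
                     full_confidence: int = 5) -> str:
    """Decide the periodic action for one emitter on the situation board."""
    recently_seen = now - emitter["last_seen"] <= recent_window
    if not recently_seen:
        # Case 1: no recent observations, so confidence decays.
        emitter["confidence"] -= decay
        if emitter["confidence"] < no_confidence:
            return "delete-emitter"
    if emitter["confidence"] > full_confidence and emitter.get("cluster") is None:
        # Case 2: confident but unclustered; try to match or create a cluster.
        return "associate-with-cluster"
    if recently_seen and emitter.get("cluster") is not None:
        # Case 3: recently observed emitter that already has a cluster.
        return "update-cluster"
    return "no-action"
```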

Also  on  a  periodic  basis,  the  state  of  each  hypothesized  cluster 
on  the  situation  board  is  examined.  If  all  of  the  emitters
associated  with  the  cluster  have  been  deleted,  then  the  cluster 
is  deleted  from  the  situation  board.  Otherwise: 


1.  The  cluster  is  checked  to  see  if  it  should  be  split
into  two  (or  more)  clusters  based  on  the  current
locations  of  its  associated  emitters.  If  so,  new
clusters  with  the  appropriate  associated  emitters  are
hypothesized  on  the  situation  board.

2.  The  track  history,  heading  history,  speed  history
and  activity  history  of  the  cluster  are  updated;  and,
if  any  new  emitters  have  been  recently  associated
with  the  cluster,  an  estimate  of  the  types  and
numbers  of  aircraft  comprising  the  cluster  is
derived.

3.  A  current  status  report  for  the  cluster  is  output.


4Knowledge  relating  an  aircraft  type,  for  example  F-15  or  MIG-1,  with  the
number  and  types  of  radars  it  carries  is  available.  Using  this  knowledge  and
the  identified  emitter  types  in  a  cluster,  it  is  possible  to  roughly  estimate
bounds  on  the  number  and  types  of  aircraft  in  the  cluster.


The  ELINT  processing  flow  lends  itself  naturally  to
concurrent  execution.  The  parallel  implementation  of  ELINT
using  CAOS  is  described  in  Section  4.  The  CAOS  system 
itself  is  described  in  the  following  section. 


3.  The  CAOS  Programming  Framework 

CAOS  is  a  framework  which  supports  the  encoding  and  the 
execution  of  multiprocessor  expert  systems.  It  represents  an 
early  attempt  to  bridge  the  gap  between  the  application 
specification  and  the  multiprocessor  system  programming 
primitives.  The  design  of  CAOS  is  predicated  on  the  belief 
that  many  highly  parallel  architectures  (e.g.,  hundreds  of 
processors)  will  emphasize  limited  communication  between 
processor-memory  pairs  rather  than  uniformly  shared  memory. 
We  expect  that  such  an  architecture  will  favor  relatively 
coarse-grained  problem  decomposition  with  little 
synchronization  between  processors.  CAOS  is  intended  for 
use  in  real-time,  data  interpretation  applications  such  as 
continuous  speech  recognition  and  radar  and  sonar  signal
interpretation  (see,  for  example,  [9,  10]).  CAOS  is  based  on 
an  object-oriented  programming  paradigm,  and  it  draws  many
of  its  ideas  from  the  Flavors  system  [1]  and  the  Actors 
paradigm  [11]. 

A  CAOS  application  consists  of  a  collection  of 
communicating,  active  agents,  each  responding  to  a  number  of 
application-dependent,  predeclared  messages.  An  agent  retains 
long-term  local  state.  Each  agent  is  a  multi-process  entity, 
that  is,  an  arbitrary  number  of  processes  may  be  active  at  any
one  time  in  a  single  agent.5  Conceptually,  an  agent  can  be
thought  of  as  a  virtual,  multiprocess  processor  and  memory  pair.
It  responds  to  externally  sent  messages,  and  these  message
responses  can  alter  the  state  of  its  local  memory  and  can 
include  the  sending  of  messages  to  other  agents. 

grain-size.  For  example,  in  the  ELINT  experiment,  the
message  handlers  (i.e.,  the  methods)  which  implement  the 
message  responses  are  written  as  Lisp  procedures,  each 
averaging  about  one  hundred  lines  of  primitive  Lisp  code. 
CAOS  supports  no  mechanism  for  finer-grained  concurrency 
such  as  within  the  execution  of  agent  processes,  but  neither 
does  it  rule  it  out.  We  could  easily  imagine  message  methods 
being  written,  for  example,  in  QLisp  [12],  a  concurrent 
dialect  of  CommonLisp  which  supports  finer-grained 
concurrency. 


3.1.  CAOS’  Approach  to  Concurrency 

A  CAOS  application  is  structured  to  achieve  high  degrees  of
concurrency  in  the  application  execution  in  two  principal
manners:  pipelining  and  replication.  Pipelining  is  most
appropriate  for  representing  the  flow  of  information  between
levels  of  abstraction  in  an  interpretation  system.  Replication
provides  means  by  which  the  interpretation  system  can  cope
with  arbitrarily  high  data  rates.


5The  active  processes  in  an  agent  are  not  scheduled  preemptively.  Instead,
an  executing  agent  process  either  runs  to  completion  or  runs  until  it  is  blocked
awaiting  some  remote  service  (see  Section  5).


3.1.1.  Pipelining 

Pipelining  is  a  common  means  of  parallelizing  tasks  through  a 
decomposition  into  a  linear  sequence  of  concurrently 
operating  stages.  Each  stage  is  assigned  to  a  separate 
processing  unit  which  receives  the  output  from  the  previous 
stage  and  provides  input  to  the  next  stage.  Optimally,  when 
the  pipeline  reaches  a  steady-state,  each  of  the  processors  is 
busy  performing  its  assigned  stage  of  the  overall  task. 

CAOS  promotes  the  use  of  pipelines  to  partition  an 
interpretation  task  into  a  sequence  of  interpretation  stages 
where  each  stage  of  the  interpretation  is  performed  by  a 
separate  agent.  As  data  enters  one  agent  in  the  pipeline,  it  is 
processed,  and  the  results  are  sent  to  the  next  agent.  The  data 
input  to  each  successive  stage  represents  a  higher  level  of
abstraction. 
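
The structure of such an agent pipeline can be sketched with one thread per stage and a queue between neighbouring stages. This is an analogy in Python rather than CAOS code: the function names are invented for illustration, and Python threads only illustrate the message flow, since they do not give true parallel execution of interpreted code.

```python
import queue
import threading

_DONE = object()  # sentinel marking the end of the input stream


def stage(func, inbox, outbox):
    """Run one pipeline stage: apply func to each item and pass the result on."""
    while True:
        item = inbox.get()
        if item is _DONE:
            outbox.put(_DONE)  # propagate shutdown to the next stage
            return
        outbox.put(func(item))


def run_pipeline(funcs, items):
    """Connect one thread per stage with queues and push items through."""
    queues = [queue.Queue() for _ in range(len(funcs) + 1)]
    threads = [threading.Thread(target=stage, args=(f, queues[i], queues[i + 1]))
               for i, f in enumerate(funcs)]
    for t in threads:
        t.start()
    for item in items:
        queues[0].put(item)
    queues[0].put(_DONE)
    results = []
    while (out := queues[-1].get()) is not _DONE:
        results.append(out)
    for t in threads:
        t.join()
    return results
```

Because each stage is a single consumer of a FIFO queue, results leave the pipeline in the order the inputs entered it.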

Sequential  decomposition  of  a  large  task  is  frequently  very 
natural.  Structures  as  disparate  as  manufacturing  assembly 
lines  and  the  arithmetic  processors  of  high-speed  computing 
systems  are  frequently  based  on  this  paradigm. 

Pipelining  provides  a  mechanism  whereby  concurrency  is 
obtained  without  duplication  of  mechanism  (i.e.,  machinery, 
processing  hardware,  knowledge,  etc.).  In  an  optimal  pipeline
of  n  processing  elements,  the  throughput  of  the  pipeline  is  n 
times  the  throughput  of  a  single  processing  element  in  the 
pipeline. 

Unfortunately,  it  is  often  the  case  that  a  task  cannot  be 
decomposed  into  a  simple  linear  sequence  of  subtasks.  Some 
stage  of  the  sequence  may  depend  not  only  on  the  results  of 
its  immediate  predecessor,  but  also  on  the  results  of  more 
distant  predecessors,  or  worse,  some  distant  successor  (e.g.,  in
feedback  loops).  An  equally  disadvantageous  decomposition  is
one  in  which  some  stages  require  more  time  than  others.  The  effect  of  either  of  these
conditions  is  to  cause  the  pipeline  to  be  used  less  efficiently. 
Both  these  conditions  may  cause  some  processing  stages  to  be 
busier  than  others.  In  the  worst  case,  some  stages  may  be  so 
busy  that  other  stages  receive  almost  no  work  at  all.  As  a 
result,  the  n-element  pipeline  achieves  less  than  an  n-times 
increase  in  throughput.  We  discuss  a  partial  remedy  for  this 
situation  below. 
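
A short calculation shows why unbalanced stages reduce the gain. As a simplifying assumption (not stated in the source), take a pipeline with fixed per-stage service times and no communication cost, and compare it with one processor that performs every stage in sequence. In steady state the pipeline completes one item per slowest-stage time, so the speedup is the sum of the stage times divided by the largest one.

```python
def pipeline_speedup(stage_times):
    """Steady-state speedup of a pipeline over one processor doing all stages.

    The single processor needs sum(stage_times) per item; the pipeline emits
    one item every max(stage_times), so the ratio is at most len(stage_times).
    """
    return sum(stage_times) / max(stage_times)
```

With four equal stages the speedup is the full factor of four, while a single stage that takes four times as long as the others pulls it down to 1.75.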


3.1.2.  Replication

Concurrency  gained  through  replication  is  ideally  orthogonal 
to  concurrency  gained  through  pipelining.  Any  size 
computing  structure,  from  an  individual  processing  element  to
an  entire  pipeline,  is  a  candidate  for  replication.  Consider  a 
task  which  must  be  performed  on  the  average  in  time  t,  and  a 
processing  structure  which  is  able  to  perform  the  task  in  time 
T,  where  T  >  t.  If  this  task  were  actually  a  single  stage  in  a 
larger  pipeline,  this  stage  would  then  be  a  bottleneck  in  the 
throughput  of  the  pipeline.  However,  if  the  single  processing 
structure  which  performed  the  task  were  replaced  by  T/t 
copies  of  the  same  processing  structure,  the  effective  time  to 
perform  the  task  would  approach  t,  as  required.  Replication 
is  more  costly  than  pipelining,  but  it  does  avoid  some  of  the 
problems  associated  with  developing  a  pipelined 
decomposition  of  a  task. 
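
The replication arithmetic can be made concrete. The sketch assumes ideal replication, that is, work divides evenly among copies with no coordination cost, and it rounds T/t up to a whole number of copies; both the rounding and the function names are choices made for this illustration.

```python
import math


def copies_needed(T, t):
    """Smallest whole number of copies so that T / copies does not exceed t."""
    return math.ceil(T / t)


def effective_time(T, copies):
    """Average time per task when `copies` identical structures share the load."""
    return T / copies
```

For example, a stage that needs 10 time units against a required 3 calls for four copies, giving an effective time of 2.5 units per task.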

Our  work  leads  us  to  believe  that  such  replicated  computing 
structures  are  feasible,  but  not  without  drawbacks.  Just  as 
performance  gains  in  pipelines  are  impacted  by  inter-stage 
dependencies,  performance  gains  in  replicated  structures  are 
impacted  by  inter-structure  dependencies. 

Consider  a  system  composed  of  a  number  of  copies  of  a 
single  pipeline.  Further,  assume  the  actions  of  a  particular 
stage  in  the  pipeline  affect  each  copy  of  itself  in  the  other
pipelines.  In  an  expert  system,  for  example,  a  number  of 
independent  pieces  of  evidence  may  cause  the  system  to  draw 
the  same  conclusion.  The  system  designer  may  require  that
when  a  conclusion  is  arrived  at  independently  by  different
means,  some  measure  of  confidence  in  the  conclusion  is 
increased  accordingly.  If  the  inference  mechanism  which 
produces  these  conclusions  is  realized  as  concurrently 
operating  copies  of  a  single  inference  engine,  the  individual 
inference  engines  will  have  to  communicate  between 
themselves  to  avoid  producing  multiple  copies  of  the  same 
conclusion  rather  than  a  composite  conclusion.  Any 
consistency  requirement  between  copies  of  a  processing 
structure  decreases  the  throughput  of  the  entire  system,  since  a 
portion  of  the  system's  work  is  dedicated  to  inter-system 
communication.  Examples  of  this  situation  are  shown  in 
Section  4  where  we  describe  the  CAOS  agent  types  for  the 
ELINT  application.


3.2.  Programming  in  CAOS 

CAOS  is  basically  a  package  of  operators  on  top  of  Lisp. 
These  operators  are  partitioned  into  three  major  classes:
those  which  declare  agent  classes,  those  which  initialize
agents,  and  those  which  support  communication  between 
agents.  We  now  describe  briefly  the  CAOS  operators  for  each 
of  these  classes.  A  more  complete  description  of  these 
operators  is  given  in  [13]. 

3.2.1.  Declaration  of  Agents 

Agent  classes,  like  most  object-oriented  classes,  are  declared
within  an  inheritance  network.  Each  agent  class  inherits  the 
attributes  of  its  (multiple)  parents.  The  root  CAOS  agent 
class,  vanilla-agent,  contains  the  minimal  attributes  required  of 
a  functional  CAOS  agent.  All  other  CAOS  agents  have  the 
vanilla-agent  as  a  parent,  either  directly  or  indirectly. 
Another  CAOS-declared  agent  class,  process-agenda-agent,  is  a
specialization  of  vanilla-agent,  and  includes  a  priority 
mechanism  for  scheduling  the  execution  of  messages.  The 
vanilla-agent  schedules  its  messages  in  a  FIFO  manner  only. 
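
The difference between the two scheduling disciplines can be sketched as two small queue classes. These are Python analogues written for illustration, not the CAOS definitions; the class and method names are invented, and the convention that a larger number means a higher priority is an assumption.

```python
import heapq
import itertools
from collections import deque


class VanillaAgentQueue:
    """FIFO message scheduling, as described for vanilla-agent."""

    def __init__(self):
        self._messages = deque()

    def post(self, message, priority=0):
        self._messages.append(message)  # priority is ignored

    def next_message(self):
        return self._messages.popleft()


class ProcessAgendaQueue:
    """Priority scheduling; messages of equal priority keep arrival order."""

    def __init__(self):
        self._heap = []
        self._arrival = itertools.count()

    def post(self, message, priority=0):
        # heapq is a min-heap, so negate priority to serve high values first.
        heapq.heappush(self._heap, (-priority, next(self._arrival), message))

    def next_message(self):
        return heapq.heappop(self._heap)[2]
```

The arrival counter breaks ties, so the priority queue degrades to FIFO behaviour when all messages share one priority.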

Application  agent  classes  are  declared  by  augmenting  the 
following  primary  attributes  of  CAOS-declared  or  other
ancestral  agent  classes: 

Local-Variables:  An  instance  agent's  local  variables  store  its
private  state.  The  agent's  message  handlers  may  refer  freely
to  only  those  variables  declared  locally  within  the  agent. 
Each  local  variable  may  be  declared  with  an  initial  value. 

Messages-Methods:  The  only  messages  to  which  an  agent  may 
respond  are  those  declared  in  the  agent’s  class  declaration. 
Associated  with  each  declared  message  name  is  the  name  of 
the  message’s  method  (i.e.,  the  message's  message  handler).  In 
CAOS,  a  method  name  must  refer  to  a  defined  Lisp 
procedure.  This  declaration  simplifies  the  task  of  a  resource 
allocator  which  must  load  application  code  onto  each  CARE 
site. 

Clocks-Methods:  An  agent  may  periodically  invoke  actions 
based  on  internal  clock  "ticks."  For  example,  the  periodic 
update  of  emitter  agents  and  the  periodic  output  of  cluster 
status  reports  are  invoked  by  clock  ticks.  A  clock  is  defined 
by  its  tick  interval.  Whenever  an  internal  agent  clock  ticks,
the  set  of  methods  associated  with  that  clock  are  scheduled  for 
execution. 
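
Clock-driven scheduling can be illustrated with a function that, given the current time, returns the methods whose clocks tick at that instant. The representation is an assumption made for the sketch: clocks are a mapping from an integer tick interval to a list of method names, and time is an integer counter starting at zero.

```python
def due_methods(clocks, now):
    """Return the methods to schedule at time `now`.

    `clocks` maps a tick interval to the methods attached to that clock.
    A clock ticks whenever `now` is a positive multiple of its interval.
    """
    scheduled = []
    for interval, methods in sorted(clocks.items()):
        if now > 0 and now % interval == 0:
            scheduled.extend(methods)
    return scheduled
```

In the ELINT setting, one clock might drive the periodic emitter update and a slower clock the cluster status reports; the method names used for that are up to the application.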

Critical-Methods:  This  attribute  declares  certain  sets  of
methods  as  being  mutually  "critical  regions"  for  their  owning
agents.6  Each  such  set  of  critical  methods  has  an  associated
lock.  Before  an  owning  agent  executes  a  critical
method,  this  lock  is  checked.  If  it  is  unlocked,  the  agent
locks  it  and  executes  the  method.  Upon  completion  of  the
method,  the  agent  unlocks  the  lock.  If  the  lock  is  locked,  the
method  is  queued  in  a  FIFO  queue  awaiting  the  unlocking  of
the  lock.
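
The lock-and-queue behaviour of a set of critical methods can be sketched as below. The class is a hypothetical illustration, not the CAOS mechanism itself. One simplification is made deliberately: on release, the lock is handed directly to the oldest waiting method instead of being unlocked and re-acquired, which gives the same FIFO order.

```python
from collections import deque


class CriticalSet:
    """One lock shared by a set of mutually critical methods."""

    def __init__(self):
        self.locked = False
        self.waiting = deque()

    def request(self, method):
        """Return the method if it may run now; otherwise queue it, return None."""
        if self.locked:
            self.waiting.append(method)
            return None
        self.locked = True
        return method

    def release(self):
        """Called on completion; return the next method to run, if any."""
        if self.waiting:
            return self.waiting.popleft()  # lock stays held for this method
        self.locked = False
        return None
```

Since CAOS agent processes are not preempted, the check and the lock acquisition in `request` would not need further protection within a single agent; that reading is an inference from the scheduling footnote, not a statement in the source.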


6A  design  goal  for  ELINT  in  CAOS  was  to  avoid  the  use  of  critical
methods,  and  our  ELINT  implementation  does  not  use  any.  The  CAOS
initialization  routines,  however,  do  use  such  methods. 


There  are  a  number  of  additional  basic  agent  attributes. 
However,  most  of  these  are  used  only  internally  by  CAOS. 

3.2.2.  Initialization  of  agents 

An  initial  CAOS  configuration  is  specified  by  a 
two-component  initialization  form.  The  first  component  of 
the  form  creates  the  static  agent  instances.  Some  agent 
instances  are  created  during  system  initialization  and  exist 
throughout  a  CAOS  run.  Such  agent  instances  are  called  static 
agents  as  opposed  to  dynamic  agents  which  are  created  (and 
possibly  deleted)  during  program  execution.  For  programmer 
convenience,  we  allow  code  in  agent  message  handlers  and 
default  values  of  local-variables  to  reference  such  static  agents 
by  name.  Before  an  agent  instance  begins  running,  each 
symbolic  reference  to  the  declared  static  agents  is  resolved  by 
the  CAOS  runtimes. 

The  second  component  of  the  form  is  a  list  of  expressions  to 
be  evaluated  sequentially  when  CAOS's  static  agent 
instantiation  phase  is  complete.  Each  expression  is  intended 
to  send  a  message  to  one  of  the  static  agents  declared  in  the 
first part of the form. These messages serve to initialize the
application. For example, in the ELINT application the
initialization messages open log files and start the processing
of ELINT observations.

Agent  instances  may  also  be  created  dynamically  during 
execution.  The  creation  operator  accepts  an  agent  class  name 
and  a  location  specification.7  The  remote-address  of  the 
newly-created  agent  instance  is  returned.  The  remote-address 
of  an  agent  includes  the  CARE  site  coordinates  where  the 
agent  resides  and  a  pointer  to  the  agent  in  the  address  space 
of that site. A dynamically created agent may not be
referenced symbolically; however, its remote-address may be
exchanged freely.

3.2.3.  Communications  Between  Agents 

Agents  communicate  with  each  other  by  exchanging  messages. 
CAOS  does  not  guarantee  when  messages  reach  their 
destinations.  Due  to  excessive  message  traffic  or  processing 
element  failure,  messages  may  be  delayed  indefinitely  during 
routing.  It  is  the  responsibility  of  the  application  program  to 
detect  and  recover  from  such  delayed  messages. 

Two  classes  of  messages  are  defined:  those  which  return 
values,  called  value-desired  messages,  and  those  which  do  not, 
called  side-effect  messages.  The  value-desired  messages  are 
made  to  return  their  values  to  a  special  cell  called  a  future 
which represents a "promise" for an eventual value.8 Processes
attempting to access the value of a future are blocked until
that future has had its value set. Futures are first-class data
types, and they may be manipulated by non-strict Lisp
operators (e.g., list) even if they have not yet received a value.
It is possible for the value of a CAOS future to be set more
than once, and it is possible for multiple processes to be
waiting for a future's value to be set.

The CARE primitive post-packet, which sends a packet from
one  process  to  another,  is  employed  in  CAOS  to  produce  three 
basic  kinds  of  message  sending  operations: 

post: The post operator sends a side-effect message to an
agent. The sending process supplies a remote-address to the
target agent (or its name in the case of a static agent), the
message's routing priority, and the message's name and
arguments. The sender continues executing while the message
is delivered to the target agent.


7 Currently, agents may be created only "at" or "near" specified CARE sites.
CAOS makes no attempt at dynamic load balancing.

8 Futures are also used in Multilisp [14]. The HEP Supercomputer [15]
implemented a simple version of futures as a process synchronization
mechanism.


post-future:  The  post-future  operator  sends  a  value-desired 
message  to  the  target  agent.  The  sending  process  supplies  the 
same  parameters  as  for  post,  and  it  is  immediately  returned  a 
local  pointer  to  the  future  which  will  eventually  receive  a 
value  from  the  target  agent.  As  for  post,  the  sender  continues 
executing  while  the  message  is  being  delivered  and  executed 
remotely. A process may later check the state of the future
with the future-satisfied? operator or access the future's value
with the value-future operator. This latter operator will block
the process (i.e., suspend its execution and "swap it out") if
the future has not yet received a value. When the future
finally receives a value, the blocked process is rescheduled for
resumed execution.

post-value: The post-value operator is similar to the
post-future operator except that the sending process is
immediately blocked until the target agent has returned a
value. This operator is defined in terms of post-future and
value-future, and it is provided for programming convenience.
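The relationship among the three operators can be sketched as below. This is an illustrative Python analogue, not CAOS code: an agent is modeled as a plain dictionary of handler functions, threads stand in for CARE message delivery, routing priority is omitted, and the rule that the latest value wins when a future is set twice is an assumption (the text says only that a future may be set more than once).

```python
import threading


class Future:
    """A cell holding a promised value; readers block until it is set."""

    def __init__(self):
        self._ready = threading.Event()
        self._value = None

    def set(self, value):
        # A future may be set more than once; here the latest value wins (assumption).
        self._value = value
        self._ready.set()

    def satisfied(self):   # analogue of future-satisfied?
        return self._ready.is_set()

    def value(self):       # analogue of value-future; blocks the caller
        self._ready.wait()
        return self._value


def post(agent, message, *args):
    """Side-effect message: deliver asynchronously, return nothing."""
    threading.Thread(target=agent[message], args=args).start()


def post_future(agent, message, *args):
    """Value-desired message: return a future immediately."""
    future = Future()
    threading.Thread(target=lambda: future.set(agent[message](*args))).start()
    return future


def post_value(agent, message, *args):
    """Blocking variant, defined in terms of post_future and value."""
    return post_future(agent, message, *args).value()
```

Because value() waits on an event rather than polling, any number of callers can block on the same future and all are released when it is set.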

It  is  possible  to  detect  delay  of  value-desired  messages  by 
attaching  a  timeout  to  the  associated  future.  The  operators 
post-clocked-future  and  post-clocked-value  are  similar  to  their 
untimed  counterparts  but  allow  the  caller  to  specify  a 
timeout-period  and  timeout-action  to  be  performed  if  the 
future  is  not  set  within  the  timeout-period.  Typical 
timeout-actions  include  setting  the  future's  value  to  a  default 
value  or  resending  the  original  message  using  the  repost 
operator. 
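A timeout attached to a future might look like the following sketch. It is a Python illustration under stated assumptions: ClockedFuture is a hypothetical name, the timeout-action is modeled as a function that receives the future, and a timer thread stands in for whatever clock mechanism CAOS actually uses.

```python
import threading


class ClockedFuture:
    """A future that runs a timeout-action if no value arrives in time."""

    def __init__(self, timeout_period, timeout_action):
        self._ready = threading.Event()
        self._value = None
        # Fire the timeout-action after timeout_period seconds unless cancelled.
        self._timer = threading.Timer(timeout_period, self._on_timeout,
                                      args=[timeout_action])
        self._timer.daemon = True
        self._timer.start()

    def _on_timeout(self, action):
        if not self._ready.is_set():
            action(self)  # e.g. set a default value, or resend the message

    def set(self, value):
        self._value = value
        self._ready.set()
        self._timer.cancel()  # a timely reply disarms the timeout

    def value(self):
        self._ready.wait()
        return self._value
```

Here the timeout-action that supplies a default value is simply `lambda f: f.set(default)`; a resending action would instead re-issue the original message against the same future.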

There  also  exist  versions  of  the  basic  posting  operators  which 
allow  the  same  message  to  be  sent  to  multiple  agents 
simultaneously.  These  versions  exploit  the  multicast  facilities 
of CARE (see Section 5).9

Multipost  sends  a  side-effect  message  to  a  list  of  agents  while 
multipost-future  and  multipost-value  send  value-desired 
messages  to  lists  of  agents.  In  the  latter  two  cases,  the 
associated  future  is  actually  a  list  of  futures,  and  the  future  is 
not  considered  satisfied  until  all  the  target  agents  have 
responded.  The  value  of  such  a  message  is  an  association-list 
where  each  entry  in  the  list  is  composed  of  an  agent's 
remote-address or name and the returned message value from
that  agent.  There  exist  clocked  versions  of  these  operators 
(called,  naturally,  multipost-clocked-future  and 
multipost-clocked-value)  to  aid  in  detecting  delayed  multicast 
messages. 
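The multicast value-desired operators can be pictured with the sketch below. It is an assumption-laden Python analogue: agents are a dictionary mapping names to handler tables, a thread pool stands in for CARE multicast, and the function names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

# Shared worker pool standing in for remote agents (illustrative only).
_POOL = ThreadPoolExecutor(max_workers=8)


def multipost_future(agents, message, *args):
    """Send one message to several agents; return (name, future) pairs."""
    return [(name, _POOL.submit(handlers[message], *args))
            for name, handlers in agents.items()]


def satisfied(pairs):
    """The combined future is satisfied only when every agent has responded."""
    return all(future.done() for _, future in pairs)


def value(pairs):
    """Block until all replies arrive; return an association list."""
    return [(name, future.result()) for name, future in pairs]
```

The result of value() pairs each agent's name with its reply, mirroring the association-list described in the text.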


3.3.  The  Runtime  Structure  of  CAOS 

CAOS  is  structured  around  three  principal  levels:  site,  agent, 
and  process.  Two  of  these  levels,  site  and  process,  reflect  the 
organization  of  CARE.  The  remaining  agent  level  is  an 
artifact of CAOS. We describe the runtime structure of CAOS
only briefly here. This structure is described in greater
detail in [13].

The implementation of CAOS described in this report is
written in Zetalisp [1] and the primitive CARE operators,
using Zetalisp's object-oriented programming tool, Flavors [1].

Each  CARE  site  contains  a  CAOS  Site-Manager.  A 
Site-Manager  is  realized  as  a  Flavors  instance.  Its  instance 
variables  store  site-global  information  needed  by  all  agents 
located  on  the  site.  In  addition,  each  Site-Manager  includes 
CARE-level processes which perform the functions of creating
new  agents  on  its  site  and  translating  static  agent  symbolic 
names  into  agent  addresses. 

Each  CAOS  agent  is  also  realized  as  a  Flavors  instance.  A 
CAOS agent is a multiprocess entity. Most of the processes


9 Neither CAOS nor CARE currently supports a "predicated multicast" mode
wherein messages would be sent to all agents satisfying a particular predicate.
Messages can only be multicast to a fully-specified list of agents. Receiving
agents can, of course, apply arbitrary predicates to the message in order to
determine their consequent action.


are  created  in  the  course  of  problem-solving  activity.  These 
processes are referred to as user processes. At runtime,
however,  there  are  always  two  special  processes  associated  with 
each  CAOS  agent  —  the  agent  input  monitor  process  and  the 
agent  scheduler  process.  The  agent  input  monitor  process 
watches  the  CARE  stream  by  which  the  agent  is  known  to 
other  agents.  It  handles  request  messages  and  responses  from 
value-desired  messages  from  these  agents.  CAOS  user 
processes  are  created  in  response  to  request  messages  from 
other  agents  or  clocked  methods.  The  agent  scheduler  process 
collaborates  with  the  CARE  site’s  operator  processor  in  the 
scheduling of these user processes (see Section 5).


4.  ELINT’s  Implementation  in  CAOS 

We  describe  now  the  agent  types  and  their  organization  for 
the  ELINT  application  as  implemented  in  the  CAOS 
framework.  This  implementation  illustrates  some  of  the 
benefits  and  some  of  the  drawbacks  of  the  framework.  As 
discussed  in  Section  2,  ELINT  is  an  expert  system  whose 
domain  is  the  interpretation  of  passively-observed  radar 
emissions.  ELINT  is  meant  to  operate  in  real  time.  Emitters 
appear  and  disappear  during  the  lifetime  of  an  ELINT  run. 
The  primary  flow  of  information  in  ELINT  as  implemented 
in  CAOS  is  through  a  pipeline  with  replicated  stages.  Each 
stage  in  the  pipeline  is  an  agent.  The  basic  ELINT  agent 
pipeline  is  illustrated  in  Figure  4-1. 


Figure  4-1:  The  basic  ELINT  agent  processing  pipeline. 


4.1. ELINT Agent Types

The  ELINT  agent  types  described  here  are  those  used  by  the 
CT  control  strategy  version  of  ELINT  in  CAOS  (see  Section 
6). 

Observation-Reader  Agent 

Observation-reader  agents  are  an  artifact  of  the  simulated 
environment in which our ELINT implementation runs.
Their purpose is to feed radar observations into the system.
Observation-readers are driven off system clocks. At each
clock "tick" (one ELINT time unit), they supply all
observations for the associated time interval to the proper
observation-handler agents. This behavior is similar to that
of  radar  collection  sites  in  an  actual  ELINT  setting. 

Observation-Handler Agent

The  observation-handler  agents  accept  radar  observations  from 
associated  radar  collection  sites.  Of  course,  in  the  simulated 
environment  the  observations  actually  come  from 
observation-reader  agents.  There  may  be  several 
observation-handlers  associated  with  each  collection  site.  The 
collection  site  chooses  to  which  of  its  observation-handlers  to 
pass  an  observation  based  on  some  scheduling  criteria,  for 
example,  round-robin. 

The contents of an ELINT observation were described in
Section  2.  In  particular,  each  observation  contains  an 
identifier  number  assigned  by  the  collection  site  to  distinguish 
the  source  of  the  observation  from  other  known  sources.  This 
source  identifier  is  usually,  but  not  always,  correct.  When  an 
observation-handler  receives  an  observation,  it  checks  the 
observation's  identifier  to  see  if  it  already  knows  about  the 
emitter which is the observation's source. If it does, it passes
the observation to the appropriate emitter agent which
represents the observation's source. If the observation-handler
does  not  know  about  the  emitter,  it  asks  an  emitter-manager 
agent  to  create  a  new  emitter  agent  and  then  passes  the 
observation  to  that  new  agent. 

Emitter-Manager  Agent 

There  may  be  many  emitter-manager  agents  in  the  system. 
An  emitter-manager's  task  is  to  respond  to  requests  from 
observation-handlers to create new emitter agents with
associated source identifier numbers. If there is no such
emitter agent in existence when the request is received, the
manager will create one and return its remote-address to the
requesting observation-handler agent. If there is such an
emitter agent in existence when the request is received, the
manager will simply return its remote-address to the
requestor. This situation arises when one observation-handler
requests an emitter that another observation-handler had
previously requested. Emitter-managers must also handle the
case of "almost concurrent" requests for the same emitter.
This  case  occurs  when  a  request  is  received  for  an  emitter 
agent  which  is  currently  being  created  by  another  process  on 
another  CARE  site  in  response  to  a  slightly  earlier  request. 

The reason for the emitter-manager's existence is to reduce
the amount of inter-pipeline dependency with respect to the
creation of emitters. ELINT's creation of an emitter is
similar to a typical expert system drawing a conclusion based
on some evidence. ELINT must create its emitters in such a
way that the individual observation-handlers do not each end
up creating copies of the "same" emitter, that is, creating
multiple  emitter  agents  with  the  same  associated  source 
identifier  (see  Section  3.1.2).  Consider  the  following  strategies 
that  the  observation-handler  agents  could  use  to  create  new 
emitter  agents: 

1. The handlers could create the emitter agents
   themselves immediately as needed. Since the
   collection sites may pass observations with the
   same source identifier to any observation-handler,
   it is possible for multiple observation-handlers to
   each create its own copy of the same emitter.
   This strategy is not acceptable.

2. The handlers could create the emitter agents
   themselves, but inform the other handlers that they
   have done this. This scheme breaks down when
   two handlers try simultaneously (or almost
   simultaneously) to create the same emitter.

3. The handlers could rely on a single
   emitter-manager agent to create all emitters.
   While this approach is safe from a consistency
   standpoint, it is likely to be impractical as the
   single emitter-manager could become a processing
   bottleneck.

4. The handlers could send requests to one of many
   emitter-managers chosen by some arbitrary
   method. This idea is nearly correct, but does not
   rule out the possibility of two emitter-managers
   each receiving creation requests for the same
   emitter.

5. The handlers could send requests to one of many
   emitter-managers chosen through some algorithm
   which is invariant with respect to the source
   identifiers.

This last strategy is the one used in our implementation
of ELINT. The algorithm for choosing which
emitter-manager  to  use  is  based  on  a  many-to-one  mapping 
of  source  identifiers  to  emitter-managers.10 


10 The algorithm simply computes the source identifier modulo the number
of emitter-managers and maps that number to a particular manager.
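The modulo mapping in the footnote can be written in a few lines. The sketch below is in Python for illustration; the function name and the manager names are hypothetical.

```python
def choose_emitter_manager(source_id, managers):
    """Map a source identifier to one manager, many-to-one.

    Every observation-handler that sees the same source_id computes the
    same index, so all creation requests for one emitter reach one manager.
    """
    return managers[source_id % len(managers)]


managers = ["emitter-manager-0", "emitter-manager-1", "emitter-manager-2"]
# Two handlers holding observations with identifier 7 pick the same manager.
handler_a_choice = choose_emitter_manager(7, managers)
handler_b_choice = choose_emitter_manager(7, managers)
```

Since the choice depends only on the source identifier and the fixed list of managers, no coordination among handlers is needed to keep requests for the same emitter together.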


Emitter  Agent 

Emitter agents hold the state and history of the observation
sources they represent. As each new observation is received
by an emitter agent, it is added to a list of new observations.
On a periodic basis, this list of new observations is scanned
for interesting information. In particular, after enough
observations are received, the emitter may be able to
determine the heading, speed, and location of the source it
represents. The first time it is able to determine this
information, it asks a cluster-manager agent to either match
the emitter to an existing cluster agent (as described in
Section 2.3) or create a new cluster agent to hold the single
emitter. Subsequently, it sends an update message to the
cluster agent with which it is associated, indicating its current
heading, speed, and location.

Emitters maintain a qualitative confidence level of their own
existence (possible, probable, positive, and was-positive). If
new observations are received often enough, the emitter will
increase its confidence level until it reaches positive. If an
observation is not received by an emitter in the expected time
interval, the emitter lowers its confidence by one step. If the
confidence falls below possible, the emitter deletes itself,
informing its manager and any cluster with which it is
associated of its deletion.
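The confidence ladder can be sketched as a small state machine. The Python below is illustrative only and rests on assumptions: it models just the three ordered levels possible, probable, and positive, it raises confidence by one step per timely observation, and it omits the was-positive level because its exact role is not described here. The class and method names are hypothetical.

```python
LEVELS = ["possible", "probable", "positive"]  # ascending confidence


class EmitterConfidence:
    """Tracks an emitter's belief in its own existence (hypothetical sketch)."""

    def __init__(self):
        self.index = 0        # start at "possible" (assumption)
        self.deleted = False

    @property
    def level(self):
        return None if self.deleted else LEVELS[self.index]

    def observation_received(self):
        # Raise confidence one step, saturating at "positive".
        if not self.deleted:
            self.index = min(self.index + 1, len(LEVELS) - 1)

    def observation_missed(self):
        # Lower confidence one step; falling below "possible" deletes the emitter.
        if self.deleted:
            return
        if self.index == 0:
            self.deleted = True
        else:
            self.index -= 1
```

In the full system, deletion would also notify the emitter's manager and its cluster; that messaging is left out of the sketch.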

Cluster-Manager  Agent 

The  cluster-manager  agents  play  much  the  same  role  in  the 
creation  of  cluster  agents  as  the  emitter-manager  agents  play 
in  the  creation  of  emitter  agents.  However,  it  is  not  possible 
to  compute  an  invariant  to  be  used  for  a  many-to-one 
mapping between emitters and cluster-managers. If ELINT
were to employ multiple cluster-managers, any strategy by
which an emitter agent chooses which of the many managers to
ask for a cluster match could still result in the creation of
multiple instances of the "same" cluster (i.e., multiple cluster
agents representing the same physical cluster of emitters).
Thus,  we  have  chosen  to  implement  ELINT  using  only  a 
single  cluster-manager.  Fortunately,  new  cluster  creation  is  a 
relatively  rare  event,  and  the  single  cluster-manager  has  never 
been  observed  to  be  a  processing  bottleneck. 

As  described  above,  requests  from  emitters  to  associate 
themselves  with  clusters  are  specified  as  match  requests  over 
the  extant  clusters.  Emitters  are  matched  to  clusters  on  the 
basis  of  their  location,  speed,  and  heading  histories.  However, 
the cluster-manager does not itself perform this matching
operation. Although it knows about the existence of each
cluster it has created, it does not know about the current state
of those clusters. Thus, the cluster-manager asks all of its
clusters to (concurrently) perform a match.

If none of the clusters responds with a positive match, the
cluster-manager creates a new cluster for the emitter. If one
cluster responds positively, the emitter is added to the cluster
and is informed of this fact. If more than one cluster
responds positively, this usually indicates that there is not yet
sufficient resolution of the emitter's history to uniquely
associate it with a cluster. In this case, the emitter-to-cluster
matching operation is tried again after more observations of
the emitter have been processed.
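The three outcomes of a match request can be summarized in a short decision function. This Python sketch is illustrative; the function name, the action labels, and the representation of responses as (cluster, matched) pairs are assumptions.

```python
def resolve_match(responses):
    """Decide what the cluster-manager does with the clusters' replies.

    `responses` is a list of (cluster_name, matched) pairs, one per
    existing cluster. Returns an (action, cluster_name) pair.
    """
    positive = [name for name, matched in responses if matched]
    if not positive:
        return ("create-new-cluster", None)
    if len(positive) == 1:
        return ("add-to-cluster", positive[0])
    # Ambiguous: wait for more observations, then try the match again.
    return ("retry-later", None)
```

An empty list of responses, as when no clusters exist yet, falls into the first case and yields a new cluster.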

Cluster  Agent 

The  radar  emissions  from  a  cluster  of  emitters  often  indicate 
the  activities  of  the  aircraft  represented  by  that  cluster.  For 
example,  emissions  from  a  missile  guidance  radar  indicate  that 
an  air-to-air  attack  is  imminent.  Each  cluster  agent 
periodically  applies  heuristics  about  types  of  radar  signals  to 
try  to  determine  the  current  activities  of  its  represented 
aircraft and, in particular, whether these activities represent a threat
to friendly aircraft. This activity information, the aircraft
type information, and the merged track parameters of the
emitters associated with each cluster are the primary outputs
of the ELINT system. Also, each cluster periodically checks
to see if all constituent emitters have been deleted. If so, it
deletes itself.

Time-Manager Agent

Many  of  the  knowledge-based  actions  taken  by  an  ELINT 
agent  make  use  of  the  agent’s  last-observed  time,  that  is,  the 
time  stamp  of  the  most  recent  observation  associated  directly 
or  indirectly  with  the  agent.  For  example,  if  an  emitter  agent 
determines  that  it  has  received  no  new  associated  observations 
for several data time intervals (i.e., that it is "out-of-date"), it
will consider itself as no longer existing, and it will delete
itself and all of its relational links from ELINT's situation
board.11

In an asynchronous message passing system such as CARE, it
is difficult for an agent to determine whether it is
out-of-date because it has not been observed recently or
because messages to it which would result in an update of its
last-observed time are delayed due to overall system load or
local load imbalances. One solution to this problem would be
for each observation-handler agent to send an
"end-of-observation-time-interval" message to each of its
known emitter agents whenever it observes the crossing of an
observation time interval boundary.12

This solution was rejected for the reported implementation of
ELINT because of a perceived excessive message overhead.13
Instead, our ELINT experiment uses a time-manager agent.
Whenever an observation-handler agent observes a new input
observation time stamp, it reports this new time to the
time-manager via a message. The time-manager maintains a
conservative, global current observation time which is the
minimum of the reported time stamps. Whenever any
agent considers taking a drastic, non-reversible action which
is based on its being out-of-date (e.g., deleting itself), it
requests a confirmation from the time-manager that its (the
requesting agent's) last-observed time is sufficiently older than
the time-manager's global current observation time. The
requesting agent does not perform its considered action until
it receives the confirmation. If, in the interim, the requesting
agent receives any messages which result in an update of its
last-observed time, the confirmation is ignored.
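The time-manager's bookkeeping can be sketched as follows. This is a hedged Python illustration, not the CAOS implementation: it assumes the global time is the minimum over handlers of each handler's latest reported stamp, that no global time exists until every handler has reported, and that "sufficiently older" means lagging by at least a caller-supplied margin. All names are hypothetical.

```python
class TimeManager:
    """Tracks a conservative global observation time (hypothetical sketch)."""

    def __init__(self, handler_names):
        # Latest time stamp reported by each observation-handler; None until it reports.
        self.latest = {name: None for name in handler_names}

    def report(self, handler_name, time_stamp):
        previous = self.latest[handler_name]
        if previous is None or time_stamp > previous:
            self.latest[handler_name] = time_stamp

    def current_time(self):
        """Minimum over handlers, or None while any handler is still silent."""
        stamps = list(self.latest.values())
        if not stamps or any(stamp is None for stamp in stamps):
            return None
        return min(stamps)

    def confirm_out_of_date(self, last_observed, margin):
        """True only if the agent lags the global time by at least `margin`."""
        now = self.current_time()
        return now is not None and now - last_observed >= margin
```

Taking the minimum means a slow handler holds the global time back, so an agent is confirmed out-of-date only when every input stream has moved past it.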

Reporter  Agent 

Instances  of  the  reporter  agent  class  are  used  to 
asynchronously  output  various  ELINT  reports  to  displays 
and/or  files,  for  example,  threat  reports  and  periodic  situation 
board reports. In addition, instances of a specialization of
the reporter class, debug-trace-reporter, are used during
application  program  debugging  to  asynchronously  output 
debugging  traces  in  a  manner  that  minimally  impacts  system 
timing  dependencies. 


4.2. ELINT Agent Organization

The ELINT agents are basically organized as a pipeline with
replicated stages where each stage is an agent. Inter-pipeline
dependencies and dependencies between replicated stages are
managed  by  emitter-manager  and  cluster-manager  agents.  The 
amount  of  replication  (i.e.,  the  number  of  agents)  at  each 
pipeline  stage  is  a  function  of  that  stage.  For  some  stages, 
the  number  of  replicated  agents  at  that  stage  is  fixed  during 
system  initialization.  For  example,  the  numbers  of 
observation-handler  agents,  emitter-manager  agents,  and 


"This  action  rdTetls  the  expectation  knowledge  that  if  an  emitter  within 
the  area  of  observation  is  observed  at  time  I,  then  it  3  expected  that  il  will 
be  observed  at  time  f+1. 

^Sime  each  input  observation  stream  is  in  observation-lime  sequential 
order,  each  observation-handler  eventually  knows  when  such  a  time  boundary 
is  crossed. 

l^This  over  ‘ad  may  be  more  perceived  than  actual.  A  more  recent 
implementation  of  FLINT  uses  such  "end-of-observalion-lime-inlerva!” 
messages.  Initial  results  seem  to  indicate  that  the  associated  cost  is  not 
excessive  (see  [16]) 


cluster-manager  agents  are  pre-determined  based  on  the 
number  of  collection  sites  and  their  output  data  rates.  The 
numbers of emitter stages and cluster stages vary during the
course  of  execution  since  the  corresponding  emitter  agents  and 
cluster  agents  are  created  and  deleted  as  the  radar  emitters  and 
collections  of  radar  emitters  which  they  represent  appear  and 
disappear  over  time. 

The overall organization of the ELINT agents is illustrated in
Figure  4-2. 


Figure 4-2: The overall ELINT agent communication organization.


5.  An  Overview  of  CARE 

The  CARE  architectural  specification  and  its  simulation 
environment  provide  a  parameterized  and  instrumented 
multiprocessor  simulation  testbed  designed  to  aid  research  in 
alternative  parallel  architectures.  The  testbed  executes  within 
SIMPLE, a hierarchical, event-driven simulator [3].

A  CARE  architecture  is  a  grid  of  tens  to  hundreds  of 
processing  sites  interconnected  via  a  dedicated 
communications  network.  The  network  uses  dynamic, 
buffered,  cut-through  routing,  and  it  supports  multicast 
inter-site  message  transmission.  The  ELINT  experiment,  for 
example,  was  performed  on  various  square  CARE  grids  of 
hexagonally connected sites, that is, each site, excluding those
at  the  edges  of  the  grid,  is  connected  to  six  of  its  eight 
nearest  neighbors. 

As  shown  in  Figure  5-1,  each  CARE  site  consists  of  an 
evaluator,  a  general-purpose  processor-memory  pair;  an 
operator,  a  dedicated  communications  and  process  scheduling 
processor  which  shares  memory  with  the  evaluator;  and 
network  interfaces  --  net-inputs  and  net-outputs  --  that 
accomplish  pipelined  message  transmission,  flow  control, 
deadlock  avoidance,  and  routing.  Each  net-input  at  a  site 
may  establish  a  connection  with  a  net-output  at  any  site,  and 
all  such  connections  at  a  site  may  be  simultaneously  active. 


Application-level  computations  take  place  in  the  evaluator. 
The  operator  performs  two  duties.  As  a  communications 
processor,  it  is  responsible  for  initiating  and  receiving 
messages.  As  a  scheduling  processor,  it  queues 
application-level  processes  for  execution  in  the  evaluator. 
Message  routing  is  performed  by  the  net-input  and  net-output 
network  interfaces. 

In  our  simulation  of  CARE,  the  evaluator  is  treated  as  a 
"black  box"  Lisp  processor.  None  of  its  internal  operation  is 
simulated.  The  Lisp  machine  hosting  the  simulation  serves  as 
the evaluator in each processing site. The operator, however,
is  functionally  simulated,  and  the  network  interfaces  are 
simulated  and  instrumented  in  great  detail. 

CARE allows a number of parameters of the processor grid to
be adjusted. Among these parameters are the speed of the
evaluator, the speed of the communications network, the
network routing algorithm, and the speeds of the process
creating and switching mechanisms. By altering these
parameters, a single processor grid specification can be made
to simulate a wide variety of actual multiprocessor
architectures. For example, we can experiment with the
optimal level-of-granularity of problem decomposition by
varying the speed of both process-switching and
communications. Alternative network topologies can be
studied by using SIMPLE's graphic interfaces and composition
operators to configure CARE components into any topology
that can be wired.

The CARE simulation environment provides detailed displays
of  such  information  as  evaluator,  operator,  and 
communication  network  utilization,  and  process  scheduling 
latencies.  This  instrumentation  package  informs  developers  of 
CARE  applications  of  how  efficiently  their  systems  make  use 
of  the  simulated  hardware. 

A  more  detailed  description  of  CARE  is  given  in  [16],  and 
the  technology  considerations  underlying  the  CARE 
architecture  are  discussed  in  Appendix  I. 


Figure  5-1:  A  hexagonally  connected  CARE  grid. 


6.  Results  and  Conclusions 

The CARE architectural simulation testbed and the CAOS
system we have described have been fully implemented, and
they are in use by several groups within our Architectures
Project. CAOS-CARE executes on the Symbolics 3600 family
of machines as well as on the Texas Instruments Explorer
Lisp machine. ELINT, as described in Sections 2 and 4, has
also been fully implemented, and we have analyzed its
performance on CARE grids of various sizes.


6.1.  Evaluating  CAOS 

CAOS  is  a  rather  special-purpose  environment,  and  it  should 
be evaluated with respect to the programming of concurrent,
real-time  signal  interpretation  systems.  In  this  section,  we 
explore  CAOS’s  suitability  along  the  dimensions  of 
expressiveness,  efficiency,  and  scalability. 


6.1.1.  Expressiveness 

When we ask that a language be suitably expressive, we ask
that its primitives be a good match to the concepts the
programmer is trying to encode. The programmer should not
need to resort to low-level "hackery" to implement operations
which ought to be part of the language. We believe we have
succeeded in meeting this goal for CAOS (although to date,
only CAOS's designers have written CAOS applications).
Programming in CAOS is essentially programming in Lisp
using objects but with added features for declaring,
initializing, and controlling concurrent, real-time signal
interpretation applications.


6.1.2.  Efficiency 

CAOS  has  a  very  complicated  architecture.  The  lifetime  of  a 
message  involves  numerous  processing  states  and  scheduler 
interventions.  Much  of  this  complexity  derives  from  the 
desire  to  support  alternate  scheduling  policies  within  an  agent. 
The  cost  of  this  complexity  is  approximately  one  order  of 
magnitude  in  processing  latency.  For  the  common  settings  of 
simulation  parameters,  CARE  messages  are  exchanged  in  about 
2  to  3  milliseconds,  while  CAOS  messages  require  about  30 
milliseconds.  It  is  this  cost  which  forces  us  to  decompose 
applications  coarsely,  since  more  fine-grained  decompositions 
would  inevitably  require  more  message  traffic. 

We conclude that CAOS does not make efficient use of the
underlying CARE architecture. This conclusion has led to an
evolution of both CAOS and CARE which is described briefly
in Section 6.3 and in detail in [16].


6.1.3.  Scalability 

A  system  which  scales  well  is  one  whose  performance 
increases  commensurately  with  its  size.  Scalability  is  a 
common  metric  by  which  multiprocessor  hardware 
architectures  are  judged.  For  example,  does  a  100-processor 
realization  of  a  particular  architecture  perform  ten  times 
better  than  a  10-processor  realization  of  the  same 
architecture?  Does  it  perform  only  five  times  better,  only 
just  as  well,  or  does  it  perform  even  worse?  In  hardware 
systems,  scalability  is  typically  limited  by  various  forms  of 
contention  in  memories,  busses,  etc.  The  100-processor 
system  might  be  no  faster  than  the  10-processor  system 
because all interprocessor communications are routed through
an  element  which  is  only  fast  enough  to  support  ten 
processors. 

We  ask  the  same  question  of  a  CAOS  application.  Does  the 
throughput of ELINT, for example, increase as we make more
processors  available  to  it?  This  question  is  critical  for 
CAOS-based,  real-time  interpretation  systems.  Our  only 
means  of  coping  with  arbitrarily  high  data  rates  is  by 
increasing  the  number  of  processors. 

We believe CAOS scales well with respect to the number of
available processors. The potential limiting factors to its
scaling are increased software contention, such as the
inter-pipeline bottlenecks described in Section 3, and
increased hardware contention, such as overloaded processors
and/or communication channels. Software contention can be
minimized by the design of the application. Communications
contention can be minimized by executing CAOS on top of
an appropriate hardware architecture such as that afforded by
CARE. CAOS applications tend to be coarsely decomposed.
They are bounded by computation, rather than
communication, and communications loading was not a
problem in our ELINT-CAOS-CARE experiment.

Unfortunately, processor loading remains an issue. A
configuration with poor load balancing in which some CARE
sites are busy while others are idle does not scale well.
Increased throughput is limited by contention for processing
resources on overloaded sites while resources on unloaded sites
go unused. The problem of automatic load balancing is not
addressed by CAOS, as agents are simply assigned to processing
sites on a round-robin basis with no attempt to keep
potentially busy agents apart. We currently have no solution
to the problem of processor load balancing beyond that of
carefully "hand crafting" a site allocation strategy for each
application and then "tuning" that strategy via successive
refinement.
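
The effect of round-robin site assignment on load balance can be illustrated with a small sketch. This is not CAOS code: the function names, the agent names, and the per-agent load figures are hypothetical and chosen only to show how an allocation that ignores expected load can place several busy agents on one site.

```python
def assign_round_robin(agents, num_sites):
    """Assign agents to sites in creation order, ignoring expected load."""
    return {agent: i % num_sites for i, agent in enumerate(agents)}


def site_loads(assignment, load, num_sites):
    """Sum the (hypothetical) per-agent load that lands on each site."""
    totals = [0] * num_sites
    for agent, site in assignment.items():
        totals[site] += load[agent]
    return totals


# Hypothetical workload: every fourth agent is a busy one.
agents = [f"agent-{i}" for i in range(8)]
load = {a: (10 if i % 4 == 0 else 1) for i, a in enumerate(agents)}

assignment = assign_round_robin(agents, num_sites=4)
loads = site_loads(assignment, load, num_sites=4)
print(loads)  # [20, 2, 2, 2]: both busy agents share site 0
```

In this made-up case one site carries ten times the load of the others, which is the kind of imbalance that a hand-crafted allocation strategy is meant to avoid.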
6.2. Evaluating ELINT Under CAOS

The input data set used for most of our ELINT-CAOS runs
was  based  on  a  scenario  involving  16  aircraft  mounting  a 
total  of  88  radar  emitters  with  between  4  and  45  emitters 
active  and  observed  during  any  one  data  time  interval.  The 
scenario  takes  place  in  a  60  by  80  mile  area  over  36  time 
units,  and  it  involves  1040  separate  emitter  observations. 

Our experience with ELINT indicates that the primary
determiner of throughput and solution quality is the strategy
used in making individual agents cooperate in producing the
desired interpretation. Of secondary importance is the degree
to which processing load is evenly balanced over the processor
grid.  We  now  discuss  the  impact  of  these  factors  on  ELINT's 
performance. 

The  following  three  "control"  strategies  were  used  in  our 
experiment: 

1. NC: This "no control" strategy represents limited
inter-agent control. Agents initiate actions
independently. Whenever an agent wants to
perform an action, it does so as soon as processing
resources are available. For example, whenever an
observation-handler agent needs a new emitter
agent, it simply creates it with no attempt to
coordinate this creation with other
observation-handlers. As a result, multiple,
non-communicating copies of an emitter may be
created, and each copy receives only a portion of
the input data it requires. The NC strategy was
expected to produce qualitatively poor results, and
it was primarily intended only as a baseline
against which to compare more realistic control
strategies. What was surprising was that the
strategy also produced quantitatively poor results
(see below).

2. CC: In this strategy, agents cooperate in the
creation of new agents via manager agents as
described in Section 4. The manager agents assure
that only one copy of an agent is created,
irrespective of the number of simultaneous
creation requests. All requestors are returned a
reference to the single new agent. Originally, we
believed the CC (for "creation control") strategy
would be sufficient for ELINT to produce
satisficing high-level interpretations. Our
experiment results showed that this was not always
the case (see below).

3. CT: The CT ("creation and time control") strategy
was designed to additionally manage the skewed
views of real-world time which develop in agent
pipelines. For example, this strategy prevents an
emitter agent from deleting itself when it has not
received a new observation in a while even though
some observation-handler agent has sent the
emitter an observation which it has yet to receive.
The agents corresponding to the CT strategy are
those described in Section 4.
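
The creation-control idea behind the CC strategy, in which a manager guarantees a single copy of an agent regardless of how many requests arrive at once, can be sketched as follows. This is an illustrative sketch only: the class `EmitterManager`, its method names, and the dictionary used to stand in for an agent are hypothetical, and a lock is used here in place of the serial message handling a CAOS manager agent would perform.

```python
import threading


class EmitterManager:
    """Hypothetical manager: creates at most one agent per emitter id."""

    def __init__(self):
        self._lock = threading.Lock()
        self._agents = {}
        self.created = 0

    def get_or_create(self, emitter_id):
        # Serialize creation requests so that simultaneous requesters
        # all receive a reference to the same agent.
        with self._lock:
            agent = self._agents.get(emitter_id)
            if agent is None:
                agent = {"id": emitter_id, "observations": []}
                self._agents[emitter_id] = agent
                self.created += 1
            return agent


manager = EmitterManager()
results = []


def observation_handler():
    # Each handler independently asks for the same emitter.
    results.append(manager.get_or_create("emitter-7"))


threads = [threading.Thread(target=observation_handler) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(manager.created)  # 1
```

Under the NC strategy, by contrast, each handler would build its own copy, and each copy would see only part of the observations.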

Table  6-1  illustrates  the  qualitative  effects  of  the  various 
control  strategies  and  grid  sizes.  The  table  presents  the  six 
major  performance  attributes  by  which  the  quality  of  an 
ELINT run is measured. Since the input data for the ELINT
experiment  were  generated  from  known  scenarios,  it  was 
possible  to  compare  the  results  of  an  ELINT  run  with 
"ground  truth." 


Table 6-1: ELINT Solution Quality Versus
Control Strategies and Grid Sizes.


Qualitative                 Control strategy/grid size
performance
attribute        NC/16   CC/16   CC/36   CT/4   CT/16   CT/36

False alarms       1%      0       0       0      0       0
Reincarnation     49%     42       2       0      0       0
Confidences       19%     20      90      89     93      95
Fixes             48%     42      99     100    100     100
Threats           65%     63      81      87     87      90
Fusion             0%      0      77      85     88      89

The  major  qualitative  performance  attributes  are: 

False  Alarms:  This  attribute  is  the  percentage  of  emitter  agents 
that  ELINT  should  not  have  hypothesized  as  existing  with 
respect  to  the  total  number  of  emitter  agents  hypothesized. 

ELINT  was  not  severely  impacted  by  false  alarms  in  any  of 
the  control  configurations  in  which  it  was  run  as  the 
knowledge  used  for  hypothesizing  new  emitters  was  quite 
conservative. That is, the knowledge was such that it preferred
missing  a  true,  but  low  confidence,  emitter  to  creating  a  false 
alarm  emitter. 

Reincarnation:  This  attribute  is  the  percentage  of  recreated 
emitter  agents,  that  is,  emitters  which  had  previously  existed 
but  had  erroneously  deleted  themselves  due  to  lack  of  recent 
observations, with respect to the total number of emitters
created.  Large  numbers  of  reincarnated  emitters  indicate 
some  portion  of  ELINT  is  unable  to  keep  up  with  the  data 
rate.  This  can  be  caused  by  the  data  rate  being  too  high 
globally  so  that  all  processing  sites  are  overloaded  or  by  the 
data  rate  being  too  high  locally  due  to  poor  load  balancing  so 
that  some  subset  of  the  processing  sites  are  overloaded. 

The  CT  control  strategy  was  designed  to  prevent 
reincarnations.  Hence,  none  occurred  when  CT  was  employed 
on  any  size  grid.  When  the  CC  strategy  was  used,  only  the  36 
site grid was large enough for ELINT to sufficiently keep up
with  the  input  data  rate  so  that  emitters  were  not  erroneously 
deleted  due  to  overload. 

Confidence Level: This attribute is the percentage of correctly
deduced  confidence  levels  for  the  existence  of  an  emitter  with 
respect  to  the  total  number  of  times  such  confidence  levels 
were  determined. 

For  each  hypothesized  emitter,  ELINT  maintains  a  dynamic 
confidence  level  for  the  existence  of  the  emitter  based  on 
accumulating  evidence  (see  Section  4.1).  The  correct 
calculation  of  confidence  levels  depends  heavily  on  the  system 
being  able  to  cope  with  the  incoming  data  rate.  One  way  to 
improve  confidence  levels  was  to  use  a  large  processor  grid. 
The  other  was  to  employ  the  CT  control  strategy. 

Fixes:  This  attribute  is  the  percentage  of  correctly-calculated 
positional  fixes  of  emitters  with  respect  to  the  total  number 
of  times  fixes  could  have  been  determined  from  the  ground 
truth  data. 

A  fix  can  be  computed  whenever  an  emitter  has  seen  at  least 
two  observations  from  different  collection  sites  in  the  same 
data time interval. If, for example, an emitter is undergoing
reincarnation,  it  will  not  accumulate  enough  data  to  regularly 
compute  fixes.  Thus,  the  approaches  which  minimized 
reincarnation  tended  to  maximize  the  correct  calculation  of 
fix  information. 
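
The condition for computing a fix can be expressed as a short check. The sketch below is illustrative only: the function name and the representation of an observation as a (time interval, collection site) pair are assumptions, not ELINT data structures.

```python
from collections import defaultdict


def intervals_with_fix(observations):
    """Return the sorted time intervals in which a fix can be computed.

    `observations` is an iterable of (time_interval, collection_site)
    pairs for a single emitter.
    """
    sites_by_interval = defaultdict(set)
    for interval, site in observations:
        sites_by_interval[interval].add(site)
    # A fix needs observations from at least two different sites
    # within the same data time interval.
    return sorted(t for t, sites in sites_by_interval.items() if len(sites) >= 2)


obs = [(1, "A"), (1, "A"), (2, "A"), (2, "B"), (3, "B")]
print(intervals_with_fix(obs))  # [2]
```

An emitter that is repeatedly deleted and recreated splits its observations across copies, so fewer intervals pass this test.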

Threats: As described in Sections 2 and 4, certain emitter and
cluster events represent immediate threats. This attribute is
the percentage of recognized threats with respect to the total
number of threat events based on the ground truth data.

Fusion:  This  attribute  is  the  percentage  of  correct  clustering  of 
emitter  agents  to  cluster  agents.  The  correct  computation  of 
fusion  appeared  to  be  related,  in  part,  to  the  correct 
computation  of  confidence  levels.  The  fusion  process  is  also 
the  most  knowledge-intensive  computation  in  ELINT,  and  our 
imperfect  results  indicate  the  extent  to  which  ELINT's 
knowledge  is  incomplete. 

The overall goal of the control strategy experiments was to see
if it was possible to determine strategies where the quality of
the output results was relatively insensitive to grid size and
load balance but still achieved significant concurrency.

We  interpret  from  Table  6-1  that  the  control  strategy  has  the 
greatest  impact  on  the  quality  of  results.  The  CT  strategy 
produced  high-quality  results  irrespective  of  the  number  of 
processors  used.  The  CC  strategy,  which  is  much  more 
sensitive  to  processing  delays,  performed  nearly  as  well  only 
on  the  36  site  grid.  We  believe  the  added  complexity  of  the 
CT  strategy,  while  never  detrimental,  is  primarily  beneficial 
when the interpretation system might be overloaded by high
data  rates  or  poor  load  balancing. 

Table  6-2  gives  the  simulated  execution  times  for  the  ELINT 
runs  used  to  derive  the  data  in  Table  6-1,  and  Table  6-3  gives 
the  total  CAOS  message  counts  for  these  runs. 

Tables  6-2  and  6-3  clearly  show  that  the  processing  cost  of 
added  control  is  far  outweighed  by  the  benefits  in  its  use. 
Far  less  message  traffic  is  generated,  and  the  overall  simulated 
time  is  reduced.  Note  that  for  the  runs  whose  execution 
times  are  shown  in  Table  6-2,  the  input  data  rate  was  .1 
seconds  per  ELINT  time  unit.  Since  the  input  data  set  used 
for  these  runs  spanned  36  time  units,  the  last  observation  was 
fed  into  the  system  at  3.6  (simulated)  seconds.  Hence,  this  is 
the  minimum  possible  simulated  execution  time  for  these 
runs. 


Table 6-2: Simulated ELINT execution times for
various control strategies and grid sizes.


Control                    Grid size
strategy         4            16            36

NC                        >11.19 sec.
CC                         10.87           5.12
CT             11.80        8.10           4.17

Table 6-3: CAOS message counts for ELINT executions
with various control strategies and grid sizes.


Control                    Grid size
strategy         4            16            36

NC                        >16118 msg.
CC                          7375           4823
CT              4516        4703           4616

Table  6-4  and  Figure  6-1  show  the  quantitative  effect  of 
processor  grid  size  when  the  CT  control  strategy  is  employed. 
These  results  were  produced  with  the  input  data  rate  set  ten 
times  higher  (.01  seconds  per  ELINT  time  unit)  than  that 
used  to  produce  Table  6-2.  The  minimum  possible  simulated 
execution  time  for  the  runs  used  to  produce  Table  6-4  is  0.36 
seconds. 


Table 6-4: Simulated ELINT execution time versus grid
size for production runs using CT control strategy.


Grid size      Execution time

    1            9.476 sec.
    4            3.237
    9            1.517
   16             .761
   25             .541
   36             .557

As  shown  in  Figure  6-1,  the  speedup  achieved  by  increasing 
the  processor  grid  size  is  nearly  linear  in  the  1  to  25 
processor  site  range.  However,  the  36  site  grid  was  slightly 
slower  than  the  25  site  grid.14 
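
The speedup figures discussed here follow directly from the execution times in Table 6-4. The sketch below recomputes them as the ratio of the 1-site time to the n-site time. The decimal points for the 9, 25, and 36 site entries are assumed, since they are missing in the scanned table, and the efficiency measure (speedup divided by site count) is added for illustration and does not appear in the original.

```python
# Execution times in simulated seconds, as read from Table 6-4
# (decimal points for the 9-, 25- and 36-site rows are assumed).
times = {1: 9.476, 4: 3.237, 9: 1.517, 16: 0.761, 25: 0.541, 36: 0.557}


def speedup(times, sites):
    """Speedup of an n-site grid relative to the 1-site grid."""
    return times[1] / times[sites]


def efficiency(times, sites):
    """Speedup per site; 1.0 would be perfectly linear scaling."""
    return speedup(times, sites) / sites


for n in sorted(times):
    print(f"{n:2d} sites: speedup {speedup(times, n):5.2f}, "
          f"efficiency {efficiency(times, n):.2f}")
```

With these values the 36 site speedup rounds to 17.0, which agrees with the figure quoted for the basic data set in Table 6-5.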


Figure  6-1:  The  relative  speedup  of  ELINT 

executions  on  various  size  CARE  grids. 


In  this  last  case,  there  was  not  sufficient  data  per  ELINT  time 
interval  to  warrant  the  additional  processors.  That  is,  there 
was  not  enough  concurrency  to  exploit  36  processors.  This 
can  be  seen  from  Table  6-5  which  gives  timing  results  for 
larger  data  sets  with  more  emitters  and  observations  during 
each  time  interval  and,  hence,  more  potential  for  concurrency. 


14 Because of the intrinsic non-determinism of a CARE architecture, we
observed variations in the solution qualities and the run times of
different runs of the same input data set on the same size CARE grid. For
such runs the variations in solution qualities never exceeded a fraction of a
percent. However, the variations in run times were as much as [...].
This accounts for the slightly longer execution time on 36 versus 25
Table 6-5: Simulated ELINT execution times
and speedup for larger data sets.


Number of       1-site grid       36-site grid      Speedup of
Observations    execution time    execution time    36 over 1

1040             9.476 sec.         .557 sec.         17.0
2080            15.10               .948              26.5
4160            55.87              2.259              24.7

As shown in this table, for an input data set representing
twice as many emitters and observations as the basic data
set, the 36 site grid achieved a speedup factor of 26.5 (as
opposed to a speedup of 17.0 for the basic data set) over a
single processor. However, for a data set four times larger
than the basic data set, the speedup factor was only 24.8.
This was because this larger, and hence more concurrent, data
set saturated the 36 site grid. That is, the 2080 observation
data set already provided enough concurrency to fully exploit
the 36 site grid.


6.3.  Some  Open  Questions 

CAOS  has  been  a  suitable  framework  in  which  to  construct 
concurrent  signal  interpretation  systems,  and  we  expect  many 
of  its  concepts  to  be  useful  in  our  future  computing 
architectures.  Of  principal  concern  to  us  now  is  increasing 
the  efficiency  with  which  the  underlying  CARE  architecture  is 
used.  In  addition,  our  experience  suggests  a  number  of 
questions  to  be  explored  in  future  research: 

•  What  is  the  appropriate  level  of  granularity  at 
which  to  decompose  problems  for  CARE-like 
architectures? 

•  What  is  the  most  efficient  means  to  synchronize 
the  actions  of  concurrent  problem  solvers  when 
necessary? 

• How can flexible scheduling policies be
implemented  without  significant  loss  of  efficiency? 

What  is  the  impact  on  problem  solving  if  alternate 
scheduling  policies  are  not  provided? 

•  Are  there  efficient  mechanisms  for  dynamically 
balancing  processor  loads? 

We  have  started  to  investigate  these  questions  in  the  context 
of a new CARE environment. One of the primary differences
between  the  original  environment  and  the  new  environment  is 
that  the  process  is  no  longer  the  basic  unit  of  computation. 
While  the  new  CARE  system  still  supports  the  use  of 
processes,  it  emphasizes  the  use  of  contexts  which  are 
computations  with  less  state  than  those  of  processes. 

When a context is forced to suspend to await a value from a
remote service, it is aborted, and restarted from scratch later
when the value is available. This behavior encourages more
fine-grained decomposition of problems written in a
functional style where individual methods are small and
consist of a binding phase followed by an evaluation phase.
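
The abort-and-restart behavior of contexts can be sketched as below. This is a minimal illustration under stated assumptions, not the CARE implementation: `ValueNotReady`, `run_context`, and the example method `track_speed` are hypothetical names, and a plain dictionary stands in for the remote service.

```python
class ValueNotReady(Exception):
    """Raised when a context needs a remote value that has not arrived."""

    def __init__(self, key):
        super().__init__(key)
        self.key = key


def run_context(method, remote_values):
    """Run `method` from scratch; report the missing key if it must abort."""
    def fetch(key):
        if key not in remote_values:
            raise ValueNotReady(key)
        return remote_values[key]

    try:
        return ("done", method(fetch))
    except ValueNotReady as missing:
        return ("aborted", missing.key)


def track_speed(fetch):
    # Binding phase: gather every remote value first, so an abort
    # throws away no completed computation.
    distance = fetch("distance")
    elapsed = fetch("elapsed")
    # Evaluation phase: pure computation on the bound values.
    return distance / elapsed


remote = {"distance": 120.0}
first = run_context(track_speed, remote)    # aborts: "elapsed" is missing
remote["elapsed"] = 4.0                     # the remote value arrives
second = run_context(track_speed, remote)   # restarted from scratch
print(first, second)
```

Because no partial state is kept across the abort, restarting is cheap only when methods are small and do their binding up front, which is the style the text describes.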

In  addition,  CARE  now  supports  arbitrary  prioritization  of 
messages  delivered  to  streams.  As  a  result,  it  is  no  longer 
necessary  to  include  in  CAOS  a  complex  and  expensive 
scheduling  strategy.  Early  indications  are  that  the  new  CARE 
environment  with  a  slightly  modified  CAOS  environment 
performs  around  two  orders  of  magnitude  faster  than  the 
configuration  described  in  this  paper.  The  evolution  of 
CARE and CAOS based on the results of our ELINT-CAOS-CARE
experiment is described in greater detail in [16].
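
Prioritized delivery of messages to a stream can be pictured as a priority queue. The sketch below is an assumption-laden illustration rather than the CARE mechanism: the class `PriorityStream`, its method names, and the convention that larger numbers mean higher priority are all hypothetical.

```python
import heapq
import itertools


class PriorityStream:
    """Hypothetical stream that delivers the highest-priority message first."""

    def __init__(self):
        self._heap = []
        # Tie-breaker that preserves arrival order among equal priorities.
        self._counter = itertools.count()

    def post(self, message, priority=0):
        # heapq is a min-heap, so negate the priority to pop the largest first.
        heapq.heappush(self._heap, (-priority, next(self._counter), message))

    def next_message(self):
        return heapq.heappop(self._heap)[2]

    def __len__(self):
        return len(self._heap)


stream = PriorityStream()
stream.post("routine-1")
stream.post("threat", priority=5)
stream.post("routine-2")

order = [stream.next_message() for _ in range(len(stream))]
print(order)  # ['threat', 'routine-1', 'routine-2']
```

With ordering handled at the stream, an agent can simply take the next message, which is why a separate scheduling layer above it becomes unnecessary.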
APPENDIX

1.  Technology  Considerations  Underlying  the 
CARE  Architecture 

The  CARE  simulation  testbed  can  be  used  to  simulate  shared 
memory  as  well  as  message  passing  multiprocessor 
architectures.  For  example,  it  has  been  configured  to  simulate 
a  single  address  space,  shared  global  memory  architecture 
where  the  processors  (and  their  local  cache  memories)  are 
connected  to  the  shared  memory’s  controllers  via  a  switching 
network.  However,  the  intended  focus  of  the  CARE  testbed  is 
on  message  passing,  multiprocessor  architectures  where  each 
processor  has  significant  local  memory.  This  focus  is  based 
on  technology  considerations  --  primarily  communication 
versus  processing  costs. 

The  base  for  development  of  general  purpose  multiprocessor 
systems,  as  for  computer  systems  generally,  is  given  by  the 
design  constraints  and  opportunities  established  by  evolving 
semiconductor  design  and  manufacturing  processes.  The  VLSI 
design  medium  brings  a  new  perspective  on  cost  --  switches 
are  cheap  while  wires  are  expensive.  Communication  costs 
dominate  those  associated  with  logic.  Communication  is 
currently  the  resource  in  shortest  supply,  and  it  will  become 
more  of  a  constraint  rather  than  less  as  semiconductor 
lithographies  decrease. 

The  consequence  of  relatively  expensive  communication  is 
that  performance  is  enhanced  if  the  design  establishes  that 
whenever  a  lot  of  information  has  to  move  in  a  short  time,  it 
does  not  have  to  move  far.  Significant  locality  of  high 
bandwidth  links  is  a  design  goal.  Among  the  highest 
bandwidth  links  in  a  computer  system  are  those  connecting 
the  processor  and  memory.  Thus,  close  coupling  of  processors 
with  local  memory  is  preferred. 

To  reduce  demand  on  the  communications  resource  to 
supportable  levels,  local  memory  sizes  for  multiprocessors  can 
be  expected  to  increase  to  the  100K  byte  level  and  beyond, 
and  block  transfers  between  backing  store  and  such  several 
hundred  kilobyte  local  memories  will  be  used  to  make  the 
most  efficient  use  of  both  memory  structures  and 
communications  facilities.  Moreover,  the  functionality  of 
memory controllers will expand to include, for example,
management  of  request  queues,  the  dispatching  of  results,  and 
execution  of  synchronization  primitives;  and  thus,  the 
distinctions  between  a  memory  controller  and  a  small,  simple 
processor  will  become  blurred. 

The  proportion  of  area  for  a  simple,  high  performance 
processor  to  the  total  area  of  a  site  with,  for  example,  256K 
bytes  of  local  storage  can  be  reasonably  estimated  at  around 
15%.  From  (i)  this  estimate  of  the  incremental  cost  of 
adding  a  processor  to  a  block  of  memory,  (ii)  the  significant 
size of the total local storage in the system, (iii) the blurring
of distinctions between fast, simple processors and memory
controllers of increasing complexity, and (iv) the tendency
towards block transfers between local memory and backing
store,  it  follows  that  the  level  of  the  storage  hierarchy  now 
labeled  as  "random  access  memory"  is  likely  to  be  subsumed 
by  a  combination  of  large  local  memories  and  fast,  block 
access  backing  stores  in  multiprocessor  systems. 
The performance of the available communication resource
merits special attention in the design of multiprocessor
systems.  For  example,  dynamic  routing  which  selects  available 
inter-site  links  as  needed  is  useful  in  balancing  load,  and  thus 
it  allows  more  of  the  communication  resource  of  the  system 
to be exploited throughout a computation. Cut-through
routing  which  makes  a  routing  decision  on  the  fly  as  a  packet 
is  received  reduces  buffer  requirements  in  the  system  and 
minimizes  latency  experienced  in  network  transit.  Flow 
control  via  signalling  transmission  delays  back  to  the  source 
based  on  local  blockage  information  together  with  single 
"word"  buffering  and  transmission  validation  at  each  network 
input  and  output  port  allows  the  source  to  complete  a 
transmission  in  a  time  that  does  not  depend  on  the  size  of 
the network. Point to point multicast which sends
(approximately)  the  same  packet  to  multiple  targets  using 
common  resources  to  the  largest  degree  possible  can 
significantly  enhance  overall  communication  performance.  A 
communication  resource  with  these  features  provides  a 
multiprocessor system with "virtual busses" that are established
precisely  as  and  when  they  are  needed. 

These  technology  considerations  have  led  us  to  focus  our 
attention  on  the  class  of  multiprocessor  hardware  system 
architectures  exemplified  by  CARE. 
Abstract


Simulation  of  systems  at  an  architectural  level  can  offer  an 
effective  way  to  study  critical  design  choices  if  (1)  the 
performance of the simulator is adequate to examine designs
executing  significant  code  bodies  —  not  just  toy  problems  or 
small  application  fragments,  (2)  the  details  of  the  simulation 
include  the  critical  details  of  the  design,  (3)  the  view  of  the 
design  presented  by  the  simulator  instrumentation  leads  to 
useful  insights  on  the  problems  with  the  design,  and  (4)  there 
is  enough  flexibility  in  the  simulation  system  so  that  the 
asking  of  unplanned  questions  is  not  suppressed  by  the  weight 
of  the  mechanics  involved  in  making  changes  either  in  the 
design  or  its  measurement.  A  simulation  system  with  these 
goals  is  described  together  with  the  approach  to  its 
implementation.  Its  application  to  the  study  of  a  particular 
class  of  multiprocessor  hardware  system  architectures  is 
illustrated. 


1.  INTRODUCTION 

Simulation  systems  are  quite  often  developed  in  the  context  of 
a  particular  problem.  To  a  degree,  this  is  true  for  SIMPLE, 
an  event  based  simulation  system,  and  CARE,  the  computer 
array  emulator  that  runs  on  SIMPLE.1  The  problem 
motivating  the  development  of  both  SIMPLE  and  CARE  was 
the  performance  study  of  100  to  1000-element  multiprocessor 
systems  executing  a  set  of  signal  interpretation  applications 
implemented  as  "1000  rule  equivalent  expert  systems" 
[Brown86].

A  set  of  constraints  pertinent  to  this  problem  governed  the 
design  of  SIMPLE/CARE.  The  applications  represented 
significant  bodies  of  code  and  so  simulation  run  times  were 
expected  to  be  an  important  consideration.  Moreover,  the 
issues  involved  with  the  interactions  of  multiprocessor  system 
elements  were  sufficiently  unexplored  prior  to  simulation  that 
simplifications  in  the  CARE  system  model,  specifically  with 
respect  to  element  interactions,  were  suspect.  This  need  for 
detail  was,  of  course,  in  tension  with  the  need  for  simulation 
performance.  The  ways  that  simulated  system  components 
would  be  composed  into  complete  systems  was  initially 
difficult to bound. Further, it was clear that the models of
these components would be elaborated over time and would
undergo substantial change as design concepts evolved. It was
also clear that the ways of examining the operation of these
components would change independently (and at a great rate)
as early experience indicated what alternative aspect of system
operation should have been monitored in any given completed
run.


This work was supported by DARPA Contract F30602-85-C-0012, NASA
Ames Contract NCC 2-220-SI, and Boeing Contract W266875. Greg Byrd
was supported by an NSF Graduate Fellowship and by the Stanford
University Department of Electrical Engineering.

1 SIMPLE and CARE were developed by the authors at the Knowledge
Systems Lab of Stanford University. SIMPLE is a descendant of PALLADIO
[Brown83] optimized for the subset of PALLADIO's capabilities relevant to
hierarchical design capture and simulation. It is written in
Zetalisp [Weinreb81] and currently runs on Symbolics 3600 machines and TI
Explorers.

The  design  goals  that  emerged  then  were  (1)  that  the 
simulation  system  should  support  the  management  of 
substantial  flexibility  with  regard  to  simulated  system 
structure,  function,  and  instrumentation  and  (2)  that,  in  order 
to  accomplish  runs  in  acceptable  elapsed  times,  the  detail  of 
simulation  should  be  particularly  focused  on  the 
communications,  process  scheduling,  and  context  switching 
support  facilities  of  the  simulated  system  —  that  is,  on  just 
those  aspects  of  system  execution  critical  to  multiprocessor  (as 
opposed  to  uniprocessor)  operation. 


1.1.  Design  Time  Interaction  And  Run  Time  Operation 
Encapsulation  of  the  state  of  design  components  with  the 
procedures  that  manipulate  that  state  is  one  clear  way  to 
manage  design  evolution.  Such  encapsulation  partitions  the 
design  along  well  defined  boundaries.  Components  (by  and 
large)  interact  with  other  components  only  through  defined 
ports.  Connections  between  components  terminate  at  such 
ports.  When  a  system  simulation  is  initialized,  connections 
are  traced  so  that  for  every  port,  the  simulator  knows  the 
connected  (terminating)  ports  together  with  their  containing 
components.  Once  such  initialization  is  complete,  that  is, 
throughout  the  simulation  run,  assertions  about  the  state  of  a 
port  of  one  component  can  be  directly  translated  to  assertions 
about  the  state  of  connected  ports  of  other  components. 
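
The connection tracing performed at initialization can be sketched as follows. This is an illustrative sketch, not SIMPLE code: the class `Circuit`, its method names, and the representation of a port as a (component, port name) pair are assumptions made for the example.

```python
from collections import defaultdict


class Circuit:
    """Hypothetical netlist: ports are (component, port_name) pairs."""

    def __init__(self):
        self._wires = []
        self._peers = {}
        self.state = {}

    def connect(self, port_a, port_b):
        self._wires.append((port_a, port_b))

    def initialize(self):
        # Trace connections once so each port knows every port it reaches.
        adjacency = defaultdict(set)
        for a, b in self._wires:
            adjacency[a].add(b)
            adjacency[b].add(a)
        for port in adjacency:
            seen, stack = {port}, [port]
            while stack:
                for nxt in adjacency[stack.pop()]:
                    if nxt not in seen:
                        seen.add(nxt)
                        stack.append(nxt)
            self._peers[port] = seen - {port}

    def assert_value(self, port, value):
        # An assertion about one port translates directly into
        # assertions about all ports connected to it.
        self.state[port] = value
        for peer in self._peers.get(port, ()):
            self.state[peer] = value


circuit = Circuit()
circuit.connect(("cpu", "out"), ("net", "in"))
circuit.connect(("net", "in"), ("mem", "in"))
circuit.initialize()
circuit.assert_value(("cpu", "out"), 1)
print(circuit.state[("mem", "in")])  # 1
```

Doing the tracing once, before the run, means that no connection lookup is needed each time a port changes during simulation.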

Partitioning  issues  of  system  structure,  component  behavior, 
and  instrumentation  into  separate  domains  of  consideration 
helps  in  managing  a  design  that  is  both  fluid  and  complex. 
System  structure,  that  is,  the  relationship  between  components, 
can  be  specified  through  use  of  an  interactive,  graphics 
structure  editor  and  is  largely  independent  of  component 
function  per  se.  Component  behavior  is  encapsulated  in  a  set 
of  definitions  pertinent  to  the  given  class  of  component. 
Each  component  in  a  SIMPLE  simulated  system  is  a  member 
of  a  class  defined  for  that  component  type.  Instrumentation 
is  automatically  and  invisibly  made  part  of  the  definition  of 
each  simulated  component  that  is  to  be  monitored  during  a 
run.  This  is  done  by  arranging  that  the  class  of  every 
component  to  be  monitored  is  a  specialization  of  the  general 
instrumented-box class. The basic data structures and
procedures  for  monitoring  simulated  components  and 
maintaining  the  organizational  relationships  between  each 
component  and  its  related  instrumentation  are  inherited 
through  this  general,  ancestral  class  and  are  thus  made  a 
separate,  substantially  independent  consideration  in  the  design. 
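
Inheriting instrumentation from a common ancestral class can be sketched as below. The original system is written in Zetalisp. This Python sketch only illustrates the idea, and the class names `InstrumentedBox` and `Fifo`, the probe signature, and the use of attribute assignment as the monitored event are all assumptions.

```python
class InstrumentedBox:
    """Hypothetical ancestral class that carries the monitoring machinery."""

    def __init__(self):
        # Bypass the hook below while the probe list itself is created.
        object.__setattr__(self, "_probes", [])

    def attach_probe(self, probe):
        self._probes.append(probe)

    def __setattr__(self, name, value):
        # Every state change is reported to the attached probes, so a
        # component's own behavior code needs no monitoring calls.
        object.__setattr__(self, name, value)
        for probe in self._probes:
            probe(self, name, value)


class Fifo(InstrumentedBox):
    """A component class: its behavior never mentions probes."""

    def __init__(self):
        super().__init__()
        self.depth = 0

    def push(self):
        self.depth += 1


log = []
fifo = Fifo()
fifo.attach_probe(lambda component, name, value: log.append((name, value)))
fifo.push()
fifo.push()
print(log)  # [('depth', 1), ('depth', 2)]
```

The component definition and the monitoring definition can then change independently, which is the separation of concerns the text describes.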

A  further  partitioning  of  concerns  is  employed  to  separate  out 
the  definition  of  the  application  programming  language 
interface and its support (as provided by CARE) from the
underlying information flow control governing component
behavior. The behavioral descriptions of components (which
are  expressed  as  sets  of  condition/action  rules)  deal 
generically  with  gating  information,  independently  of  the 
structure  of  the  information,  between  ports  of  the  component 
and  its  internal  state  variables.  This  is  separated  in  the 
component model definitions from the functions performed
to create and manipulate the information so gated. The
simulated  implementation  of  the  application  programming 
language  support  facilities,  on  the  other  hand,  relies  only  on 
the  specifics  of  the  information  and  its  structure  and  plays  no 
part  in  gating  it  between  the  components  of  the  system. 
Changing  the  definition  of  the  application  language  is  thus 
done  independently  of  changing  component  flow  control 
behavior. The application programmer and the implementer
of  the  application  language  interface  may  use  whatever  data 
structures  seem  suitable  to  them,  be  they  numbers  and 
keywords  or  procedure  bodies  and  execution  environments. 
The  simulation  system  doesn't  care. 

The  component  probe  definitions,  that  is,  the  specifications  of 
what  information  should  be  captured  for  each  component 
type,  are  separated  from  the  descriptions  of  the  behavior  of 
such  components.  In  designing  for  flexibility  in  the 
instrumentation  system,  it  turned  out  to  be  important  to 
further  divide  the  information  presentation  from  the 
information  collection  issues.  The  mapping  from  particular 
component  probes  to  particular  instrument  panels  and  the 
transformations  to  be  applied  to  the  information  as  it  passed 
from  a  given  kind  of  probe  to  a  given  panel  (and  between 
panels)  is  captured  in  the  instrument  specification.  This  is  a 
definition  of  what  kinds  of  panels  are  included  in  an 
instrument,  how  they  fit  on  an  instrument  screen,  how  they 
are  labeled  and  scaled,  and  what  information  from  which 
kinds  of  probes  are  displayed  on  each  panel.  The  instrument 
specification  also  indicates  what  kinds  of  probes  are  to  be 
connected  to  which  kinds  (that  is,  which  classes)  of 
components  in  the  system. 


mechanisms  of  a  multiprocessor  applications  language.  These 
specify  the  interface  used  to  provide  the  program  input  to  the 
multiprocessor  system  being  simulated.2  The  definitions  used 
to  generate  component  probes  are  associated  with  each  library 
component  to  be  monitored.  There  may  be  several  such 
definitions, each appropriate to measuring a different aspect
of the associated component's operation. An instrument
specification  selects  from  these  definitions,  elaborates  them 
with  selections  from  a  set  of  probe  operation  modules  to 
include  any  pre-processing  (for  example,  a  moving  average)  to 
be  calculated  by  the  probe,  and  indicates  under  what 
conditions  what  information  from  the  probe  is  to  be  sent  to 
which  panels  of  the  instrument  and  how  it  is  to  be 
transformed  and  displayed  there.  Instrument  specifications 
also  partition  the  screen  among  the  panels  of  the  instrument. 
The  end  product  of  these  design  time  interactions  is  an 
instrumented  circuit  and  an  instrument.  The  instrument 
comprises  a  set  of  instrument  panels  and  a  set  of  constraints 
relating  them  to  the  instrument  screen.  The  instrumented 
circuit  ties  together  instances  of  components,  probes,  and 
panels  for  a  simulation  run. 

For  each  defined  class  of  component  and  its  associated 
probes,  the  design  time  interactions  produce  code  bodies  that 
accomplish  simulation  operations  during  a  run.  It  is  an 
attribute  of  the  underlying  Lisp  base  of  the  simulation  system 
that  changes  in  these  definitions  have  immediate  effect  even 
during  a  simulation  run  --  an  important  capability  during 
debugging. 


2.  STRUCTURE  AND  COMPOSITION 

Design  time  interactions  to  specify  a  system  include  the 
establishment  of  component  relationships.  Such  specifications 
can  be  said  to  accomplish  the  composition  of  the  system  from 
its  components  and  so  define  its  structure.  SIMPLE  supports 
hierarchical  composition:  components  may  be  described  in 
terms of a fixed set of relationships among their

Figure 1-1: Design Time Interactions and Run Time Representations


Putting  together  all  the  definitions  of  components,  component 
probes,  panels,  instruments,  applications  interfaces,  and 
inter-component relationships is done in a set of design time
interactions  by  a  system  architect.  These  interactions  are  used 
by  the  simulation  system  to  generate  efficient  run  time 
representations  so  that  simulation  performance  goals  can  be 
met.  Figure  1-1  illustrates  the  partition  between  design  time 
interactions  and  simulation  run  time  operation.  Structure 
editing  pulls  together  components  from  the  component  library 
to  produce  a  circuit.  Associated  with  some  components  in  the 
library, there are definitions for the syntax and underlying


2The language primitives supplied can be used to define multiprocessor
language interfaces for either shared-variable or value-passing paradigms. As
supplied, the language interface built on these primitives supports
value-passing on streams between objects but alternative interfaces can be
(and have been) easily defined in terms of the given primitives.

sub-components. Additionally, such composite components
may have function beyond what can be inferred strictly from
their composition. All this can then be included in a higher
level composite (as shown in figure 2-1) and so on
until [text illegible in the source]
is reached.

The  behavior  induced  on  a  composite  component  from  its 
parts  changes  according  to  the  behavior  of  its  parts.  Thus, 
for  example  in  figure  2-1,  if  at  any  time  during  a  simulation 
the  function  of  CARE  operator  components  is  changed  by 
redefining  their  operation,  the  behavior  of  the  nine-site  grid 
is  in  immediate  correspondence.3 


net-output, the fifo-buffer, the operator, and the evaluator.
The  net-input,  net-output  and  fifo-buffer  accept  (or  block), 
route,  and  buffer  transmissions.  They  do  so  in  accordance 
with  a  dynamic,  flow-controlled,  multicast,  cut-through 
protocol as described in [Byrd87]. The
evaluator  does  the  real  work  of  the  application:  evaluating  the 
application  of  functions  to  their  parameters.  The  operator 
does the overhead work associated with such evaluations: for
example,  scheduling  processes  and  sending  and  receiving  (but 
not  routing)  messages. 

In  keeping  with  the  objective  of  focusing  simulation  cycles  on 
the  aspects  of  the  simulation  particularly  relevant  to 
multiprocessor  operation,  the  behaviors  of  the  net-input, 


Figure  2-1:  Hierarchical  Composition 


Composition  is  described  graphically  and  interactively  in 
SIMPLE  by  picking  a  previously  specified  component  type 
from  a  menu,  placing  it  in  relationship  to  other  components 
with "mouse" movements, and, through the same means,
specifying the connections between its selected ports and those
of other components (as indicated in figure 2-2).

Through  another  menu  selection,  ports  can  be  defined  for  the 
new composite component so that it, in turn, can be fitted
into  yet  higher  level  structures.  Such  external  ports  can  be 
connected  directly  to  ports  of  sub-components  "within"  the 
composite.  If  this  is  done,  information  appearing  on  that 
external port will be the responsibility of the connected
sub-component. By this same means, a component previously
described as a base level component can be redefined as a
composite of yet lower level elements as its design is
elaborated with further details.

Components  and  (internal)  connections  can  also  be  deleted 
from  a  library  component  and  replaced  with  substitute 
components.  After  all  sub-components  and  connections  have 
been  added,  deleted,  elaborated,  and  replaced  as  required,  the 
completed  structure  can  then  be  entered  into  a  library  of 
components  and  used  in  turn  to  compose  higher  or  equivalent 
level  components. 


2.1. CARE Base Components

CARE  supplies  a  small  library  of  system  level  base 
component  types.  Currently  these  are  the  net-input,  the 


3However, for reasons concerning simulation performance and because of
their relatively low frequency, changes in the number and names of the
internal state variables of components and the structural relationships between
sub-components of a composite are not reflected in an already instantiated
circuit. Changes in the internal structure of a CARE site library component,
for example, will be reflected only in circuits instantiated after the change
took  effect.  For  this  reason  and  to  reduce  long  term  storage  requirements  and 
load  time  for  the  fundamentally  iterative  circuits  that  we  primarily  study,  we 
do  not  keep  files  of  instantiated  circuits.  They  are  instantiated  as  needed 
from  a  high  level  library  component  with  the  same  prototypical  structure. 


Figure  2-2:  Graphic  Structure  Specification 


net-output,  and  fifo-buffer  component  classes  are  defined  in 
fair  detail,  that  is,  at  the  register  transfer  level.  Routing 
operations  are  described  procedurally  and  assumed  to  occur 
within  a  time  set  by  a  parameter  to  the  simulation.  As 
indicated  previously,  the  simulation  of  the  operator  and 
evaluator  is  broken  into  two  aspects:  the  control  of  the  flow 
of  information  and  the  functions  performed  on  that 
information.  The  former  is  described  in  terms  of  SIMPLE 
behavior  rules  (as  documented  in  section  3),  register  transfer 
by  register  transfer.  The  latter  is  described  directly  in  terms 
of  procedures  and  the  simulated  time  taken  by  such 
procedures  is  modeled.  In  the  case  of  the  operator,  this  is 
done as a function of the number of storage cells manipulated
during an operator procedure. In the case of the evaluator,
this is done as a function of the execution time used by the
machine  executing  the  simulation,  that  is,  the  simulation 
vehicle. 
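
The two timing models just described can be sketched as follows. This is a minimal Python illustration, not the CARE implementation (which is written in Lisp): the function names, the linear cost per storage cell, and the scale factor from host seconds to simulated time units are all assumptions made for the example.

```python
import time

def operator_time(cells_touched, time_per_cell=1.0):
    """Simulated operator time as a function of the storage cells
    manipulated (assumed linear here; the paper only says "a function of")."""
    return cells_touched * time_per_cell

def evaluator_time(procedure, *args, scale=1e6):
    """Run the procedure on the host (the "simulation vehicle") and convert
    the elapsed host time in seconds into simulated time units.
    The scale factor is a hypothetical parameter."""
    start = time.perf_counter()
    result = procedure(*args)
    elapsed = time.perf_counter() - start
    return result, elapsed * scale

op_cost = operator_time(12, time_per_cell=0.5)
value, eval_cost = evaluator_time(sum, range(1000))
```

Because the evaluator cost depends on the host machine, it varies from run to run, while the operator cost is deterministic.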


2.2.  CARE  Composite  Components 

The  prototypical  composite  component  supplied  with  CARE  is 
the  site.  As  supplied,  it  includes  net-inputs  and  net-outputs 
for  up  to  eight  "neighboring"  components  (generally  other 
sites),  a  net-input  and  a  net-output  with  associated 
fifo-buffers  for  local  receptions  and  transmissions,  and, 
finally,  an  operator  and  evaluator  as  described  above. 
Specializations  of  the  site,  for  example,  the  torus-site,  exist  in 
the  library  to  fit  the  site  into  alternative  topologies  by 
supplementing the site routing and wiring procedure as
appropriate  to  the  topology. 


2.3. Automatic Composition in CARE

Although  any  connection  of  components  can  be  created  by  the 
means  noted  previously,  for  some  repetitive,  well  patterned 
systems  of  connections,  composition  can  be  automated.  The 
CARE  library  includes  a  component,  the  iterated-cell,  which 
represents  a  template  for  the  creation  of  composite 
components  by  iteration  of  a  unit  cell.  The  unit  cells  (for 
example,  the  torus-site)  are  specializations  of  other 
components  (for  example,  the  site)  as  just  discussed.  The 
specializations  include  a  method  for  responding  to  a  request 
to  provide  a  wiring  list.  Such  a  list  associates  each  source 
port  of  a  cell  with  the  corresponding  destination  port  (in 
terms  of  port  names)  and  the  position  of  the  destination  cell 
relative  to  the  source  cell  in  the  iterated  structure.  The 
iterated  cell  component  uses  this  information  to  make  the 
required  connections  between  each  of  its  constituent  cells. 
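
As an illustration of how a wiring list can drive iteration of a unit cell, here is a minimal Python sketch. The port names, the tuple layout of the wiring list, and the use of modular arithmetic for the torus wrap-around are assumptions for the example; in CARE the wiring list is supplied by a method of the specialized cell (for example, the torus-site).

```python
# Hypothetical wiring list for a unit cell in a 2-D torus:
# (source port, destination port, (row offset, column offset)).
TORUS_WIRING = [
    ("north-out", "south-in", (-1, 0)),
    ("south-out", "north-in", (1, 0)),
    ("east-out", "west-in", (0, 1)),
    ("west-out", "east-in", (0, -1)),
]

def iterate_cells(rows, cols, wiring):
    """Return every connection implied by repeating the unit cell."""
    connections = []
    for r in range(rows):
        for c in range(cols):
            for src_port, dst_port, (dr, dc) in wiring:
                dst = ((r + dr) % rows, (c + dc) % cols)  # wrap at the edges
                connections.append(((r, c), src_port, dst, dst_port))
    return connections

grid = iterate_cells(3, 3, TORUS_WIRING)
```

For a nine-site grid this yields four outgoing connections per cell, each named only by port and relative position, so the same wiring list serves any grid size.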


3.  SPECIFYING  BEHAVIOR 

SIMPLE is an event based simulator. The behavior of a
simulated  component  is  described  in  terms  of  responses  to  the 
events  pertinent  to  that  component.  A  component's  response 
may  include  consequent  events  to  be  handled  by  the  simulator 
as well as direct operations on component state. Assertion of
consequent  events  and  the  responses  to  them  (involving 
further  consequences)  drives  the  simulation.  When  there  are 
no  more  events  to  handle,  the  simulation  is  complete. 
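
The event-driven cycle described above can be sketched in a few lines. This is a toy Python illustration under stated assumptions, not SIMPLE itself: the class names, the `handle` method, and the 10-unit tick spacing are hypothetical.

```python
import heapq
from itertools import count

class Simulator:
    """Toy event queue: events are handled in time order until none remain."""
    def __init__(self):
        self._queue = []
        self._tie = count()  # tie-breaker keeps same-time events in FIFO order
        self.now = 0

    def assert_event(self, component, name, value, time):
        heapq.heappush(self._queue, (time, next(self._tie), component, name, value))

    def run(self):
        while self._queue:  # no more events: the simulation is complete
            time, _, component, name, value = heapq.heappop(self._queue)
            self.now = time
            component.handle(self, name, value, time)

class Counter:
    """Hypothetical component: counts ticks and schedules the next one."""
    def __init__(self, limit):
        self.limit = limit
        self.ticks = 0

    def handle(self, sim, name, value, time):
        self.ticks += 1                  # direct operation on component state
        if self.ticks < self.limit:      # consequent event
            sim.assert_event(self, "tick", None, time + 10)

sim = Simulator()
counter = Counter(limit=3)
sim.assert_event(counter, "tick", None, 0)
sim.run()
```

Each response either changes local state or asserts further events; the run ends when the queue drains.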

To  maintain  modularity  in  a  simulation  system,  responses  to 
simulation  events  should  be  local  to  the  affected  component 
and  its  defined  ports,  that  is,  its  connection  to  the  remainder 
of  the  simulated  system.  The  composition  system  of  the 
simulator  maintains  the  relationship  between  ports  of  one 
component  and  those  of  other  components  connected  to  them. 
Assertions  relative  to  a  port  of  a  component  are  thus 
systematically  translated  to  events  pertinent  to  components 
connected  to  it.  This  is  the  general  mechanism  for  event 
propagation  between  components.  In  a  limited  number  of 
cases,  a  direct  operation  on  a  related  component  may  be 
appropriate.  With  fair  warning  about  its  possibility  of  abuse, 
a  facility  is  provided  to  accomplish  this. 
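
The translation of a port assertion into events for connected components might look like the following Python sketch. The `Wiring` class, its method names, and the component and port names are hypothetical; the sketch only shows that the asserting component never names its neighbors directly.

```python
from collections import defaultdict

class Wiring:
    """Hypothetical connection table kept by the composition system."""
    def __init__(self):
        self._links = defaultdict(list)  # (component, port) -> [(component, port)]

    def connect(self, src, src_port, dst, dst_port):
        self._links[(src, src_port)].append((dst, dst_port))

    def translate(self, src, src_port, value, time):
        """Turn one port assertion into events for every connected component."""
        return [(dst, dst_port, value, time)
                for dst, dst_port in self._links[(src, src_port)]]

wiring = Wiring()
wiring.connect("site-a", "east-out", "site-b", "west-in")
wiring.connect("site-a", "east-out", "monitor", "tap")
events = wiring.translate("site-a", "east-out", "packet-1", 40)
```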


3.1. Behavioral Rules

The  behavior  of  a  component  is  described  in  terms  of  its 
responses  to  pertinent  events.  Each  event  stipulates  the 
component  affected,  its  port  or  state  variable  signalled  with 
an  assertion,  the  asserted  value,  and  the  simulated  "time"  of 
the event. The time of an event may be thought of as the
"current" simulation time. Differences in event times
represent  the  temporal  relationship  between  events.  Event 
times  in  SIMPLE  simulations  are  monotonically  increasing. 

For  each  type  of  component,  there  is  a  procedure  to  handle 
pertinent  events.  The  arguments  to  the  procedure  are  those 
stipulated  by  the  event  (as  just  described).  The  procedure 
tests for conditions and, as satisfied, asserts or directly effects
consequent actions. The conditions may include arbitrary
predicates  on  the  event  parameters  and  the  state  variables  of 
the  component. 

Event  based  simulators  are  based  on  the  assumption  that  state 
and  port  variables  remain  unchanged  until  explicitly  modified. 
Synchronous  designs,  that  is,  those  in  which  the  opportunities 
for  state  change  are  temporally  quantized  to  a  clock,  can  be 
modeled  in  such  implicitly  asynchronous,  event  based 
simulators  by  asserting  the  clock  signal  on  a  port  of  each  and 
every  clocked  component  of  the  simulated  system.  If  only 
some  of  the  components  in  a  system  need  take  action  on  each 
clock  signal,  there  is  an  obvious  inefficiency  in  this  approach 
that  is  crippling  for  systems  with  even  a  modest  number  of 
components. 

If, however, event times in an event based simulator are
restricted to integers, the clock can be assumed. All that is
needed is a way to detect the event for which a boolean
combination of conditions as strobed by an assumed clock is
satisfied. To this end, condition predicates support
detecting  an  "edge"  (a  value  changed  by  the  current  event) 
with  a  coincident  "level"  (a  value  set  before  the  current  event) 
of  two  ports  or  state  variables  of  a  component  in  either  of 
the  two  possible  event  sequences.  The  predicate  both-states 
in  the  example  evaluator  behavior  rule  shown  in  figure 
3-1  has  these  semantics. 
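
One possible reading of the edge-with-level test is sketched below in Python. It assumes that the component state already reflects the current event's assertion and that each event changes exactly one port or state variable; the function signature is hypothetical and is not the SIMPLE definition of both-states.

```python
def both_states(state, changed_name, name_a, value_a, name_b, value_b):
    """True when the variable changed by the current event (the "edge") and
    the other variable (the "level") both hold their target values.
    Either variable may be the one that changed."""
    if changed_name == name_a:
        edge, level = (name_a, value_a), (name_b, value_b)
    elif changed_name == name_b:
        edge, level = (name_b, value_b), (name_a, value_a)
    else:
        return False  # the current event touched neither variable
    return state[edge[0]] == edge[1] and state[level[0]] == level[1]

state = {"Evaluator-Status": "ready", "Evaluator-Queue-Status": "some"}
fires_on_status = both_states(state, "Evaluator-Status",
                              "Evaluator-Status", "ready",
                              "Evaluator-Queue-Status", "some")
fires_on_queue = both_states(state, "Evaluator-Queue-Status",
                             "Evaluator-Status", "ready",
                             "Evaluator-Queue-Status", "some")
```

The rule fires whichever of the two conditions becomes true last, which is what lets an assumed clock stand in for an explicit clock port.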

Figure 3-1 illustrates the generality of SIMPLE behavioral
descriptions.  The  underlying  object-oriented  programming 
system,  Flavors  [Weinreb81],  in  which  SIMPLE  is 
implemented  provides  for  direct  reference  of  component  state 
variables.  The  conditions  and  actions  of  behavior  rules  for  a 
component  then  need  only  name  the  component's  port  or  state 
variable  (as  stipulated  in  the  definition  of  that  component 
type)  to  get  or  change  the  appropriate  value  in  the  component 
instance  for  which  the  event  is  pertinent.  Actions  may 
include  arbitrary  procedures:  for  example,  the  procedures 
user-evaluate and queue-take in the given example.


;; If the evaluator is ready and there is at least one runnable process -
;; (state variable naming: see footnote 4)
((or (both-states Evaluator-Status 'ready Evaluator-Queue-Status 'some)
     (both-states Evaluator-Status 'ready Evaluator-Queue-Status 'full))
 ;; - make it current, start evaluation, and adjust status as per removal.
 (setq Evaluator-Status 'busy)                            ; block rule
 (assert-state Evaluator-Status 'busy now)                ; next event
 (setq Current-Evaluation (queue-take Evaluator-Queue))   ; note process
 (user-evaluate Current-Evaluation now)                   ; execute it
 (send self :evaluator-queue-decreased now))              ; note change

Figure 3-1: Example Condition/Action Behavior Rule


3.2.  Using  Methods 

The  environment  for  the  execution  of  the  procedures  defining 
responses  to  events  includes  the  state  variables  and  ports  of 
the  component  instance  for  which  the  event  is  pertinent. 
These  procedures  are  Flavor  methods  [Weinreb81]  (in  this 
case  corresponding  to  the  :ApplyRules  message)  of  the 
component  type  and,  as  just  noted,  refer  implicitly  to  the  state 
variables  of  the  component  instance  handling  the  event. 
Other  methods  may  be  defined  for  simulated  components:  for 
example, the :evaluator-queue-decreased method invoked
in  figure  3-1.  Such  methods  have  proved  to  be  a  natural  way 
to  realize  the  functional  operations  of  components  not 
described  by  behavior  rules. 

The  composition  system  leaves  information  about  the 
enclosing  and  contained  component  instances  for  each 
simulated  component  in  system  defined  state  variables  of  that 
component.  With  this  information,  methods  directly 
referencing  the  ports  and  state  variables  of  such  related 
components  may  be  invoked  as  needed.  This  is  a  useful  but 


4By convention, component state variables are written in capitalized form.

sharp-edged facility. The warning about loss of modularity
given  previously  applies  here. 


4.  INSTRUMENTATION 

The  results  of  a  simulation  are  primarily  the  insights  it 
provides into the operation of the simulated system. The
"insight" we frequently experienced using an early version of
the  simulation  system  was  that  more  interesting  results  could 
have  been  produced  by  the  run  just  completed  if  only  the 
instrumentation  had  been  different.  With  this  in  mind,  the 
design  for  the  current  version  of  the  simulation 
instrumentation  system  was  aimed  at  flexibility.  This  was 
attained  without  significant  performance  impact  by  building 
efficient  run-time  system  structures  before  each  run,  as 
outlined  in  section  1.1,  from  the  declarations  defining  the 
instrumentation. 

The  organization  of  the  instrumentation  system  is  pictured  in 
figure  4-1.  The  simulator  interacts  with  component  instances 
through assertions, that is, calls on an assert function, in
behavior rules (the methods associated with :ApplyRules
messages). All instrumented components are specializations of
an instrumented-box (as well as other classes). After each
invocation of :ApplyRules for such components, the
:ApplyRules method for a generic instrumented-box is
applied. This causes invocation of the :trigger method for
each component-probe associated with that component. Since
this flow of measurements is accomplished by means invisible
to the writer of behavior methods for a component, the
concerns  surrounding  component  design  are  effectively 
partitioned  from  component  instrumentation.  The  remainder 
of  this  section  details  these  "invisible"  means  used  to 
accomplish  measurement  flow  during  a  simulation  run  as  the 
measurements  are  staged  from  components  through  component 
probes  to  instrument  panels. 


Figure 4-1: Instrument System Organization


4.1.  Component  Probes 

The  first  filtering  of  events  is  done  by  component  probes. 
Some  events  cause  no  further  measurement  activity  since,  as  it 
turns  out,  not  all  events  merit  action  on  the  part  of  the 
instrumentation  system.  The  parameters  of  the  event  and  the 
ports  and  state  variables  of  the  instrumented  component 
dealing  with  the  event  are  available  to  the  component  probe 
as  are  the  state  variables  of  the  probe  itself.  Each  piece  of 
the  selected  information  is  tagged  with  an  identifying  keyword 
and passed along as the parameters of the :trigger method
along with a keyword identifying the type of component
probe, a number representing the current event time, and a
pointer  to  the  component  with  which  the  information  is  to  be 
associated  in  the  display.  This  pointer  might  be  to  some 
component  related  to  the  one  actually  handling  the  event,  for 
example,  the  component  enclosing  it. 

Component  probes  may  be  composed  of  predefined  probe 
operation  modules  to  do  standard  calculations  (for  example, 
moving  averages)  and  then  to  forward  the  results  to  selected 
panels.  In  order  to  automate  the  composition  of  probes  to 
accomplish  such  operations,  each  of  these  operations  is 
chained  together  by  invoking  the  method  for  that  probe  that 
is  associated  with  the  system-defined  message  name  of  the 
generic next operation. Thus, the :trigger method calls the
:calculate method of the probe which, in turn, calls its
:select method which, finally, calls the :update method of
the selected panels associated with the probe. Probes are
composed by naming them as specializations of appropriate
probe operation modules (for example, a :calculate module
for  moving  averages)  as  desired.  The  default,  if  no 
specializations  are  stipulated,  is  to  pass  through  information 
without  change  to  all  the  panels  associated  with  a  probe. 
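
The chaining of probe operations can be illustrated with Python mixins standing in for Flavors specializations. This is a sketch under assumptions: the class names, the three-sample window, the "load" key, and the use of a plain list as a panel are all hypothetical.

```python
from collections import deque

class Probe:
    """Default probe: passes information unchanged to all of its panels."""
    def __init__(self, panels):
        self.panels = panels

    def trigger(self, info, time, component):
        self.calculate(info, time, component)

    def calculate(self, info, time, component):
        self.select(info, time, component)

    def select(self, info, time, component):
        for panel in self.panels:
            self.update(panel, info, time, component)

    def update(self, panel, info, time, component):
        panel.append((component, time, info))

class MovingAverage:
    """Hypothetical calculate module: replaces "load" by a moving average."""
    window = 3

    def calculate(self, info, time, component):
        if not hasattr(self, "_samples"):
            self._samples = deque(maxlen=self.window)
        self._samples.append(info["load"])
        smoothed = dict(info, load=sum(self._samples) / len(self._samples))
        super().calculate(smoothed, time, component)  # continue the chain

class AveragedLoadProbe(MovingAverage, Probe):
    """Composed by naming the module ahead of the default probe."""

panel = []  # a plain list stands in for a panel
probe = AveragedLoadProbe([panel])
for t, load in enumerate([0.0, 0.5, 1.0, 1.0]):
    probe.trigger({"load": load}, t, "operator-1")
```

A probe with no modules mixed in falls through every stage and forwards its information unchanged, matching the default described above.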

Information  flow  between  components  and  panels  is 
accomplished  by  the  component  probes  associated  with  each 
instrumented  component.  The  creation  of  such  component 
probes  and  their  association  with  appropriate  components  (by 
execution of :add methods) accomplishes the instrumentation
of  a  circuit.  This  is  done  when  an  instrument  is  created. 
During  simulation  initialization,  the  components  of  the  circuit 
(and  their  sub-components)  to  be  instrumented  are 
(recursively)  examined  by  each  template  probe  defined  for  the 
instrument  to  see  if  they  are  to  be  monitored.  If  so,  the 
:copy  method  for  the  given  template  probe  is  invoked  to 
create  a  new  instance  of  the  appropriate  component  probe  and 
add  it  to  the  probes  connected  to  the  component.  Each 
template  probe  previously  received  the  identifiers  for  the 
panels  to  which  its  clones  should  send  information.  These 
will  be  the  panels  identified  when  a  component  probe  invokes 
the :update method.
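
The recursive attachment of probe copies during initialization can be sketched as follows. This Python example is illustrative only: components are modeled as dictionaries, and the class name, the `parts` key, and the panel identifiers are hypothetical.

```python
class TemplateProbe:
    """Hypothetical template: knows which component class it monitors and
    which panels its clones should report to."""
    def __init__(self, kind, monitored_class, panel_ids):
        self.kind = kind
        self.monitored_class = monitored_class
        self.panel_ids = list(panel_ids)

    def copy(self):
        return {"kind": self.kind, "panels": list(self.panel_ids)}

def instrument(component, templates):
    """Recursively attach probe copies to matching components."""
    for template in templates:
        if component["class"] == template.monitored_class:
            component.setdefault("probes", []).append(template.copy())
    for sub in component.get("parts", []):
        instrument(sub, templates)

site = {"class": "site", "parts": [{"class": "operator"}, {"class": "evaluator"}]}
templates = [
    TemplateProbe("operator-load", "operator", ["site-correlation", "system-history"]),
    TemplateProbe("evaluator-load", "evaluator", ["site-correlation"]),
]
instrument(site, templates)
```

Each clone carries the panel identifiers it received from its template, so no lookup is needed when it later sends an update.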


4.2.  Instrument  Specifications 

The  operations  performed  by  an  instrument  panel  are  to: 

•  Find information previously stored according to
the component pointer supplied by the :update
method;

•  Link new data structures as needed (to save such
information) to other such structures of the panel;

•  Save in these data structures the results of
expressions that reference indicated keyed
information from the :update parameters and the
prior contents of the structures;

•  Send  the  results  of  periodic  analyses  on  the 
information  associated  with  a  panel  for  display  by 
the  same  panel  or  by  some  other;  and 

•  Show  processed  information  in  the  manner 
specified  for  the  panel. 
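
The find, link, save, send, and show operations listed above can be sketched with a toy panel. This Python example is an assumption-laden illustration, not the CARE panel implementation: the `Panel` class, its callable `save` and `analyze` parameters, and the "peak" measure are hypothetical.

```python
class Panel:
    """Toy panel: one saved record per component pointer."""
    def __init__(self, save, analyze=None):
        self.save = save        # (info, prior record) -> values to store
        self.analyze = analyze  # optional periodic analysis over all records
        self.records = {}

    def update(self, component, time, info):
        record = self.records.get(component)       # find
        if record is None:
            record = self.records[component] = {}  # link a new structure
        record.update(self.save(info, record))     # save
        record["time"] = time

    def send(self, target):
        if self.analyze is not None:
            target.append(self.analyze(self.records))

    def show(self):
        return sorted(self.records.items())

peak = Panel(
    save=lambda info, prior: {"peak": max(info["load"], prior.get("peak", 0.0))},
    analyze=lambda records: max(r["peak"] for r in records.values()),
)
peak.update("operator-1", 10, {"load": 0.4})
peak.update("operator-1", 20, {"load": 0.2})
peak.update("operator-2", 20, {"load": 0.7})
summary = []
peak.send(summary)
```

The save expression sees both the new keyed information and the prior contents of the record, which is what allows running measures such as a peak or a sum.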

The  defaults  for  the  panel  operations  supply  the  most 
commonly  required  specifications  implicitly,  so  simple 
operations  are  simply  specified.  These  defaults  can  be 
overridden  as  needed  and  either  predefined  or  user  specified 
alternatives  for  the  panel  operations  can  be  selected  in  their 
place.  Arbitrarily  complex  (Lisp)  expressions  can  be  used  to 
specify the transformations between the information provided
by  a  probe  and  that  saved  and  displayed  by  the  panel. 

These  transformations  and  all  the  default  overrides  for  the 
panel  operations  that  are  stipulated  in  the  instrument 
declaration  are  scanned  when  a  new  instrument  is  created  for 
a simulation session. They are compiled at that time into
code bodies referenced by run time control blocks associated
with  each  panel.  A  simulated  system  is  instrumented  by 
examining  all  of  its  components  and  attaching  to  each 
component  the  copies  of  template  probes  specified  by  the 
instrument  definition  that  are  appropriate  for  the  component 
(by  means  of  calls  on  the  :copy  and  :add  methods  for  the 
probe).  This  can  be  a  many  to  many  relationship  as  shown 
in  figure  4-2. 

Component  probes  to  measure  "load"  and  "latency"  are 
specified  in  the  given  example  for  each  operator  and 
evaluator in the circuit. The "load" and current connection
for each net-output are also to be monitored. Some panels, for
example  the  one  showing  "consumer-limited"  processes, 
receive  inputs  from  only  one  type  of  component  probe,  those 
measuring evaluator latency. Others, such as the one
measuring "process-latency", receive inputs from more than
one  kind  of  probe  (in  this  case,  from  probes  measuring 
operator  latency  as  well  as  those  measuring  evaluator  latency). 
A  way  must  thus  be  provided  to  distinguish  the  type  of  probe 
sending  information  to  a  panel;  this  is  described  in  the  next 
section. 

Some  probes  send  information  to  only  one  panel,  for 
example,  the  net-output  connection  probes.  Others  monitor 
information which is selected by several panels, for example,
the operator latency probe. Transformation of the raw
information provided by a probe will need to be specialized
to the information expected by each panel receiving it. A
general way to stipulate these transformations is described in
the  next  section. 


evaluators  of  the  system. 

The  balance  between  the  "availability"  of  the  evaluator  and 
operator  of  each  site,  that  is,  the  complements  of  their 
respective  loads,  is  displayed  during  the  simulation  as  events 
are  processed  that  change  this  measure.  In  order  to  avoid 
capturing  information  at  too  fine  a  temporal  granularity, 
previously  gathered  information  for  a  given  site  is  overwritten 
if  it  is  within  a  given  sampling  interval  of  the  new 
information.  Information  that  is  beyond  a  given  history 
range  is  dropped.  The  scale  of  availabilities  displayed  is  fixed 
between  0  and  1.0.  The  panel  specification  to  declare  all  this 
and  to  also  stipulate  the  axis  labels  is  shown  in  figure  5-2. 
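
The sampling-interval and history-range behavior can be sketched as a small update function. This Python example is a hedged illustration: the function name, the list-of-tuples history, and the choice to keep the earlier time stamp when overwriting are assumptions.

```python
def record_sample(history, time, value, sampling_interval, history_range):
    """Update a per-site list of (time, value) samples, newest last."""
    if history and time - history[-1][0] < sampling_interval:
        history[-1] = (history[-1][0], value)  # too close: overwrite
    else:
        history.append((time, value))
    # Drop samples that have fallen outside the history range.
    while history and time - history[0][0] > history_range:
        history.pop(0)
    return history

samples = []
for t, availability in [(0, 0.2), (3, 0.4), (10, 0.6), (25, 0.8), (120, 0.1)]:
    record_sample(samples, t, availability,
                  sampling_interval=5, history_range=100)
```

Overwriting bounds the number of points per site by the ratio of history range to sampling interval, regardless of how many events occur.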


5.2.  Scrolling  Line  Plot  Panels 

An  example  of  a  scrolling  line  plot  panel  is  shown  in  the 
right  half  of  figure  5-1.  This  panel  sums  the  loads  seen  by 
the resources in the simulated system and displays this as a
strip  chart,  the  "system  history".  Some  of  the  same  probe 
load  information  used  by  the  previous  panel  is  used  in  this 
panel  as  well,  but  with  different  transformations  defined  in 
the  panel  specification  as  shown  in  figure  5-3. 

Line plots may have two independently scaled vertical
axes. For the system history panel shown, the sum of network
loads  as  indicated  by  the  net-output  components  of  the  system 
is  plotted  against  the  left  axis  and  the  sum  of  the  processing 
loads  provided  by  the  current  average  of  the  sums  of  the 
operator  and  evaluator  loads  is  plotted  against  the  right  axis. 


[Figure 4-2 labels: panels, probes, components]


Figure  4-2:  Instrument  Probe  and  Panel  Relationships 


5.  EXAMPLE  PANELS 

Some  example  panels  are  described  in  this  section  to  give  a 
feel for the instrumentation possibilities available in CARE
and  elaborate  on  how  the  requirements  described  in  the 
previous  section  for  probe  type  identification  at  a  panel  and 
per  panel  specialization  of  the  information  provided  by  a 
probe  are  handled. 


5.1. Point Plot Panels

The  first  panel  (shown  in  the  left  half  of  figure  5-1)  is  an 
example  of  a  point  plot  panel  used  to  generate  a  scatter  plot. 
As  an  option,  only  points  representing  simulated  activity  over 
a  limited  past  history  from  the  most  recent  event  time  are 
kept  for  display.  In  this  example,  resource  load5  information 
is  provided  by  the  operator-load  and  evaluator-load 
component  probes  attached  respectively  to  the  operators  and 


Event  time  is  plotted  on  the  horizontal  axis.  The 
update-history  function  uses  the  component  pointer  to  find 
the  information  previously  saved  for  that  component  and 
records the current event time as the (:simulator :time) so
that  it  may  be  used  to  display  information  correctly  on  the 
horizontal  axis.  The  current  sums  of  the  evaluator  loads  and 
the  operator  loads  measured  by  the  system  are  stored  in  a 
record  for  the  given  event  time  (or  a  prior  event  time  within 
the  specified  sampling  interval)  by  the  calls  to  the  save-sum 
function  specified  as  part  of  the  save  operation. 


5Resource load is defined as (1 - 1 / (1 + aggregate-queue-length)) where
aggregate-queue-length is the sum of the lengths of all queues providing
for the resource.
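
Assuming the definition reads load = 1 - 1 / (1 + aggregate-queue-length), the measure and its complement (the "availability" used by the point plot panel) can be computed as in this Python sketch. The function names are hypothetical.

```python
def resource_load(queue_lengths):
    """Load is 0 with empty queues and approaches 1 as the queues grow."""
    aggregate = sum(queue_lengths)
    return 1 - 1 / (1 + aggregate)

def availability(queue_lengths):
    """Complement of the load."""
    return 1 - resource_load(queue_lengths)

idle = resource_load([])
light = resource_load([1])
heavy = resource_load([1, 2])
```

With this form, one queued item already gives a load of one half, and each further item moves the load closer to, but never onto, 1.0.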


Figure 5-1: Point Plot and Scrolling Line Plot Panels


`((("Operator") (0 1.0) (- 1 (:operator-load :busy)))      ; Bottom axis
  (("Evaluator") (0 1.0) ((- 1 (:evaluator-load :busy))))   ; Left axis
  :find (find-sample-distinct (:simulator :time) ,sampling-interval)
  :show (recent-history (:simulator :time) ,point-panel-history-range 0))


Figure  5-2:  Site  Correlation  Panel  Specification 


`((("Simulated Time [us]") (,history-range) (:simulator :time))  ; Bottom
  (("Network") (0 ,sites) (:net-output-load :busy save-sum))     ; Left
  (("Processing") (0 ,sites)                                     ; Right
   (average (:evaluator-load :busy save-sum)
            (:operator-load :busy save-sum)))
  :find (update-history (:simulator :time) ,sampling-interval)
  :show (recent-history (:simulator :time) ,history-range 0))


Figure  5-3:  System  History  Panel  Specification 


5.3.  Self  Scaling  Line  Plot  Panels 

Figure  5-4  illustrates  both  the  self  scaling  of  displays  and  the 
use  of  a  display  analysis  operation.  For  this  self  scaling  line 
plot  panel,  two  pieces  of  data  are  collected  for  each  operator 
in  the  system:  the  load  on  the  operator,  shown  on  the  right 
axis,  and  the  latency  of  the  information  it  has  most  recently 
received.  This  last  item  is  provided  by  the  operator  latency 
probe  in  two  parts:  (1)  the  interval  between  the  creation  of 
the  information  and  its  receipt  by  the  net-input  feeding  the 
operator  and  (2)  the  interval  between  such  receipt  and  the 
operator  taking  action  on  it.  There  are  thus  two  curves 
plotted  on  the  left  axis.  The  specification  stipulates  a  list  for 
the  left  axis  display.  The  elements  of  this  list  are  the  "net 
delay"  and  the  sum  of  this  measure  and  the  "operator  delay" 
monitored  by  the  operator  latency  probe.  Since  both  delays 
are  non-negative,  their  sum  must  be  at  least  as  large  as  either 
one taken alone: the two curves may be superimposed but
cannot cross. The difference between the two curves is the
incremental  delay  added  by  the  operator. 


The panel specification for the operator-network panel is
shown in figure 5-5. In addition to transformations shown
previously, an analysis function is stipulated for the send
operation of the panel. The information saved from each of
the probes sending :update messages to the panel is to be
sorted from the greatest to the least values of the associated
sum of delays described above. This information is to be
saved as the operator latency rank and used as such to
determine the position on the horizontal axis at which the
delay and load information will be displayed.
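
The ranking step described above can be sketched outside the simulator. The following Python fragment is an illustration only, not SIMPLE/CARE code: the function name `latency_rank`, the dictionary layout, and the choice of 1 as the first rank are assumptions made for the example. It sorts operators from the greatest to the least sum of net delay and operator delay and records each operator's position in that ordering.

```python
def latency_rank(samples):
    """Rank operators by total latency, greatest first.

    samples maps an operator name to a (net_delay, operator_delay) pair.
    The data layout and the 1-based rank are assumptions for this sketch.
    """
    # Sum of the two delays reported by the (hypothetical) latency probe.
    totals = {op: net + opd for op, (net, opd) in samples.items()}
    # Greatest total first, mirroring the #'> ordering predicate.
    ordered = sorted(totals, key=totals.get, reverse=True)
    # Position in the sorted order becomes the horizontal-axis rank.
    return {op: rank for rank, op in enumerate(ordered, start=1)}


samples = {"a": (10, 5), "b": (30, 1), "c": (2, 2)}
ranks = latency_rank(samples)
print(ranks)
```

Because the rank is recomputed on every send, an operator's horizontal position may change from one display update to the next as its latency changes.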
Figure 5-4: Self Scaling Line Plot Panel


`((("Operators") (1 :sites) (:operator-latency :rank))
  ((("latency" "us")) (0 nil)  ;Second string: 90 degree baseline shift
   ((:operator-latency (:net-delay (+ :net-delay :operator-delay)))))
  (("Load") (1 1.0) (:operator-load :busy))
  :send (sort-arrays
          ((,#'> (:operator-latency (+ :net-delay :operator-delay))))
          ((:operator-latency :rank))))


Figure  5-5:  Operator-Network  Panel  Specification 
Figure 5-6: Boxes and Lines Panel and Scrolling Text Panel


5.4.  Boxes  and  Lines  Panels 

Perhaps  the  most  intuitively  satisfying  of  the  types  of  panels 
available  is  the  boxes  and  lines  panel,  a  graphic 
representation  of  a  circuit  showing  its  components  and  their 
interconnections. An example of such a panel is shown in the
left part of figure 5-6. This class of panels uses information
left  behind  by  the  structure  editor  when  the  circuit  was 
defined.  Its  form  is  thus  automatically  generated.  The 
position  of  the  components  ("boxes")  and  the  connections 
between  them  ("lines")  in  the  display  are  used  to  animate 
system  operation.  In  the  example  shown,  the  shading  (or 
color)  of  the  boxes  is  used  to  indicate  the  availability  of  the 
evaluators  in  the  simulated  system  as  the  simulation  proceeds. 
Darkest  shades  indicate  highest  availability,  that  is,  empty 
queues  for  utilization  of  the  resource;  lighter  shades  indicate 
lower availability, that is, longer queues. The lines between
boxes indicate communication paths that are in use, that is,
not ":free" at the time of the most recent show operation
for  the  panel. 


The  panel  specification  for  the  mapping  panel,  an  instance  of 
a  boxes  and  lines  panel,  is  shown  in  figure  5-7.  There  are 
two  specifications  for  the  panel:  one  for  the  boxes  and  one 
for the lines. The specification for boxes in the panel
stipulates that the availability of evaluators in the sites
corresponding to the boxes displayed controls the shading of
those boxes. The scale is defined to run from 0 to 1.0. The
specification for lines in the panel uses the connection
information  reported  for  the  net-output  to  determine  line 
placement  on  the  display.  When  the  status  is  reported  as 
:free,  the  connection  information  is  dropped  from  the  panel 
and  the  corresponding  lines  are  removed. 


5.5.  Scrolling  Text  Panels 

Sometimes,  the  most  appropriate  way  to  display  information 
is to show it as text. Based on a similar facility provided by
the  underlying  Lisp  system,  the  scrolling  text  panel  provides  a 


((("Evaluator Available") (0 1.0) (- 1 (:evaluator-load :busy))))
((("Packet Trace") nil (:net-output-connection :points))
 (("Packet Status") nil (:net-output-connection :status))
 :find (find-and-remove ,#'eq (:net-output-connection :status) :free)))


Figure  5-7:  Mapping  Panel  Specification 
scrollable window into lines of text. In the right part of
figure 5-6, the delay in each process execution while waiting
for  something  to  do,  that  is,  the  event  time  interval  spent 
waiting  for  an  appropriate  task  to  appear  on  a  certain  stream 
of  tasks,  is  shown  together  with  the  process  that  finally 
produced  the  awaited  work.  This  information  is  sorted  so 
that  the  text  lines  appear  from  the  greatest  stream  waiting 
interval  to  the  least. 

The  values  and  formats  used  for  display  in  a  scrolling  text 
panel  are  defined  much  as  in  previously  defined  panels. 
Format  control  strings  take  the  place  of  scale  information. 
As usual, values are described by a list of forms, each one of
which specifies the transformations to perform on
information received from probes. The example specification
in  figure  5-8  shows  the  generality  with  which  probe 
information  can  be  incorporated  in  Lisp  expressions  to 
produce  transformation  specifications.  The  information  used 
to  generate  the  value  for  the  second  field  of  the  text  display 
is  based  on  the  origin  of  the  task  packet  that  arrived  on  the 
stream  the  process  was  waiting  for. 


Many of the CARE parameters are specified as overrides. If
not specified, the corresponding performance is taken as
measured on the simulation machine. Thus, the evaluation
override, that is, the time to perform an evaluation, can be
specified as non-nil in order to fix the time that each user
evaluation will take. (This is useful in making runs
repeatable for debugging.) The time that it takes to switch
context  can  be  specified  as  the  stack  group  switch  override. 
Similarly,  the  time  to  create  a  process  control  block  and  a 
stack  context  for  that  process  can  be  taken  as  given  rather 
than  measured  by  specifying  respectively  the  process  block 
creation  override  and  the  stack  group  creation  override. 

The  time  required  for  operator  execution  is  modeled  in  terms 
of  the  number  of  words  the  operator  must  manipulate  in 
handling  a  given  message.  The  manipulation  time  per  word  is 
specified  by  the  operator  word  touch  time.  Lastly,  the 
performance  of  the  communication  subsystem  is  specified  as 
communication  cycles.  This  is  done  in  terms  of  the  minimum 
number  of  evaluator  data  path  clock  times  (that  is,  event 


`((() ("~4D ~A")
   ((fix (:stream-waiting :interval))                        ;first field
    (let* ((origins (packet-origin (:stream-waiting :packet)))
           (origin (if (listp origins) (first origins) origins)))
      (remote-address-local origin))))                       ;second field
  :send (sort-arrays ((,#'> (:stream-waiting :interval))) nil))


Figure  5-8:  Producer  Limited  Process  Panel  Specification 


5.6.  Noting  Simulation  Parameters 

The  CARE  component  models  are  parameterized  through 
menu  interaction  as  shown  in  figure  5-9  to  allow  easy 
variation  of  their  performance  characteristics  relative  to  each 
other.  Additionally,  the  site  model  parameterizes  alternative 
routing  strategies:  directed,  that  is,  blocking  when  progress 
can  not  be  made  toward  the  goal;  spiraling  around  the  goal  if 
progress  toward  it  is  blocked;  and  dithering,  that  is,  routing 
away  from  the  goal  even  if  only  the  last  link  towards  it 
remains  to  be  acquired.  The  rate  at  which  each  site  accepts 
application data is also a parameter, the data rate, and can be
used  by  an  application  to  control  how  hard  it  drives  the 
simulated  system. 


times) required for a 32-bit word to pass a given point in the
network. Thus the parametric specification, "4
communication cycles", dictates that 8 bits may cross such a
boundary  each  time  the  evaluator  passes  through  one  event 
time.  If  the  communications  path  were  narrower  or  the  base 
communication  clock  rate  were  lower,  a  higher  number  would 
be  specified. 
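
The relationship between the communication cycles parameter and the effective path width is simple arithmetic, sketched below in Python. The function name and constant are illustrative assumptions and are not part of CARE; only the 32-bit word size and the "4 cycles gives 8 bits" example come from the text.

```python
WORD_BITS = 32  # word size stated in the text


def bits_per_event_time(communication_cycles):
    """Bits that cross a network boundary in one evaluator event time.

    communication_cycles is the number of event times needed for one
    32-bit word to pass a given point (hypothetical helper for this sketch).
    """
    if communication_cycles <= 0:
        raise ValueError("communication cycles must be positive")
    return WORD_BITS / communication_cycles


print(bits_per_event_time(4))  # the example in the text: 8 bits per event time
print(bits_per_event_time(8))  # a narrower or slower path needs more cycles
```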

The  last  example  of  SIMPLE  panels  is  the  annotation  panel  as 
illustrated  in  figure  5-10.  This  is  used  to  (automatically) 
record  the  date,  time,  and  parameters  of  the  simulation  run  as 
well  as  any  other  information  the  user  chooses  to  keyboard 
into  it. 
Figure 5-9: Parameter Menu
Figure 5-10: Annotation Panel
Figure 5-11: Overseer Instrument


5.7.  An  Instrument  Screen 

All  these  panels  are  put  together  in  an  instrument  screen 
according  to  a  set  of  layout  constraints  manipulated  by  the 
underlying  window  system.  The  finished  screen  might  look 
like  figure  5-11.  The  instrument  screen  is  redrawn  at  a  rate 
set  by  the  user.  By  experience,  it  is  often  better  to  update  the 
screen  at  a  frequency  low  enough  to  let  the  user  interpret  each 
screen comfortably than at the maximum rate possible. This
approach  also  restricts  the  computing  resources  consumed  by 
the  instrumentation  system.  More  focused  approaches  to 
controlling  instrumentation  load  on  the  system  include  the 
ability  to  freeze  selected  panels  and  disconnect  selected  probes 
during  a  simulation  run. 


6. USING PROGRAM DEVELOPMENT TOOLS

The  SIMPLE/CARE  simulation  system  is  integrated  into  the 
underlying Lisp machine program development environment.
The objects and data structures at both the component model
and application language interface have abstraction interfaces
that provide summary state information when they are
displayed in text form. These text abstractions are "mouse
sensitive" in the development machine environment and so
can  be  inspected  at  successively  finer  levels  of  detail  as 
desired. 

In  figure  6-1,  the  net-output  components  of  the  site  at  grid 
coordinates  (3  2),  the  particulars  of  the  net-output  on  the  east 
side  of  the  site  (that  is,  net-output-3),  and  a  summary  of 
all  the  sub-components  of  the  site  at  (3  2)  are  being 


inspected. This same kind of view into the progress of a
simulation is provided in the debugging process and may, as
shown in figure 6-2, refer to the conceptual entities of the
system.


In  the  example  shown  in  figure  6-2,  a  distributer  process 
running  on  the  evaluator  at  site  (1  1)  has  made  an  improper 
call on the update-locale function during execution of its
start  method.  It  might  have  been  appropriate  to 
investigate  this  situation  in  terms  of  the  modeled  components. 
That  could  be  done,  for  example,  using  the  debugger  to 
inspect  the  evaluator  component,  its  enclosing  site,  related 
net-output  components,  or  whatever  else  at  the  component 
model  level  seemed  relevant.  In  this  case,  what  was  done  was 
to  use  a  few  mouse  clicks  to  indicate  interest  in  the  source 
file for the distributer :start method generating the
problem.  It  was  brought  up  for  review  and  control  was  then 
transferred  to  an  editor  using  the  underlying  program 
development  environment  as  shown  in  figure  6-3. 


Because  of  the  implementation  system  chosen  for  the 
realization  of  SIMPLE/CARE,  at  any  point  in  the  simulation, 
procedures  either  in  the  application  or  in  the  component 
models  can  be  modified,  incrementally  recompiled  (within  a 
few  seconds),  and  be  made  effective  for  all  calls  on  them 
—  even  those  in  the  interrupted  stack  frame.  Thus  simulation 
execution  can  be  backed  up  to  some  previous  point  in  the 
stack  frame  and  retried  (given  that  intermediate  side  effecting 
code,  if  any,  is  safely  re-executable). 
Figure 6-1: Inspecting Simulated Components
Figure 6-3: Changing Application Code


7.  CONCLUSIONS 

The  goals  of  simulation  flexibility  and  simulation 
environment  completeness  have  been  dealt  with  in  the  ways 
described throughout this paper. In summary, the system is
flexible  in  that  it  supports: 

• Arbitrary data types and lengths in simulation.
The information whose flow and creation is
controlled by simulated components may be of
arbitrary complexity, from numbers and
keywords to procedure bodies and execution
environments.

• Instantaneous effect of definition change at both
the application and component modeling level
(even during a simulation run).

• A broad range of instrumentation customization.
Customizations may involve arbitrary expressions
for probe data transformations, many to many
probe to panel mappings, information from
summary analyses on one panel's data included in
another, and control of what state is saved and for
how long.

• Separation of probe and component definitions to
facilitate their independent modification.

• An application language interface that is easily
extended or changed without recasting the
information flow control described by the
component behaviors.

While  there  is  always  room  for  additional  capability6, 
SIMPLE/CARE is a usefully complete system. It now
includes: 


• Supplied components for a network multiprocessor
simulation with many of their parameters
customizable by menu interactions.

• A hierarchical structure editor that currently
provides automatic grid and torus composition
operators. (Automated composition of richer
topologies, such as hypercubes, has been provided
for in the basic design.)

• A rule language that supports a synchronous design
style without incurring the overhead of (naive)
synchronous simulation.

• Method invocation for functional simulation that
is integrated into the behavioral simulation rule
system and which provides for operations by and
on both local and hierarchically related
components.

• Method specification design aids provided by the
underlying program development environment (for
example, method dictionaries and quick access to
method sources from the debugging system).

• An evolved set of panel templates providing
sorted, scrollable text lines as well as self and
fixed scaling, "two and a half" dimensioned,
history sensitive displays which may be scatter
plots, strip charts, line graphs, intensity maps, and
signal animations.


5 A histogram panel, for example, is just now being added to the system.


We  set  off  to  build  a  multiprocessor  simulation  system  with 
performance  adequate  for  the  understanding  of  multiprocessor 
systems executing significant applications. The
SIMPLE/CARE simulation system has been used to study the
operation of "expert systems" of respectable size [Brown86].
Depending on instrumentation load, these studies have
involved simulation runs from 20 minutes to several hours
each. While faster would surely be better, performance has
proven  adequate  to  these  needs. 
Abstract


Choosing  a  multiprocessor  interconnection  topology  may 
depend  on  high-level  considerations,  such  as  the  intended 
application  domain  and  the  expected  number  of  processors.  It 
certainly depends on low-level implementation details, such as
packaging  and  communications  protocols.  We  first  use  rough 
measures  of  cost  and  performance  to  characterize  several 
topologies.  We  then  examine  how  implementation  details  can 
affect  the  realizable  performance  of  a  topology. 

1  Introduction — Design Constraints and Opportunities

The base for development of general purpose multiprocessor systems, as
for computer systems generally today, is given by the design constraints
and opportunities established by evolving semiconductor design and
manufacturing processes. The VLSI design medium brings a new
perspective on cost: switches are cheap; wires are expensive. In modern
microprocessors, communication costs dominate those associated with
logic. Power and cooling budgets are spent driving wires and,
overwhelmingly, chip area is dedicated to wiring rather than logic [17]. To
an increasing degree, the dominant delays are associated with driving
lines rather than the accomplishment of logic functions per se. One
implication is that, all other things being equal, smaller, simpler
processors can be expected to have shorter operation cycles than larger,
more complex designs [18]. They are also likely to be available in a
more recent, higher performance base technology.

At the system level, the consequence of relatively expensive
communication is that performance is enhanced if the design establishes
that whenever a lot of information has to move in a short time, it does
not have to move far. Significant locality of high bandwidth links is
a goal. Among the highest bandwidth links in a computer system is
that connecting the processor and memory. Early computer systems
separated these pieces and put a bottleneck between them to
accommodate the packaging realities of the time: processors were implemented
with electronic means, memory with magnetic, and their power
requirements and EMI characteristics were best dealt with separately. There
are new realities now: close coupling of processors with local memory
is preferred.

With these design constraints in mind, we consider a multicomputer
implementation based on a set of processor/memory pairs connected by
a communications topology. Many topologies have been proposed [8]
and have been compared in terms of theoretical cost and performance
measures [16]. We argue, however, that the realizable performance of
these topologies is closely linked to details of system packaging.

*This work was supported by DARPA Contract F30602-85-C-0012, NASA Ames
Contract NCC 2-220-S1, and Boeing Contract W266875.

†Supported by an NSF Graduate Fellowship and by the Stanford Dept. of
Electrical Engineering.


2  Interprocessor  Connection  Topologies 

Connection  schemes  between  processing  sites  can  be  compared  with 
respect  to  their  cost  and  performance  as  a  function  of  the  number  of 
sites  connected.  For  a  particular  connection  scheme,  if  the  cost  grows 
no  faster  than  the  number  of  sites  and  the  performance  grows  at  least 
as  fast,  that  scheme  can  be  described  as  scalable.  A  rough  measure  of 
cost  is  the  number  of  input-output  ports  required  for  connection.  A 
rough  measure  of  performance  is  the  number  of  links  in  the  topology 
divided  by  the  largest  number  of  links  that  must  be  traversed,  and  thus 
occupied  to  accomplish  a  transmission,  in  order  to  get  from  one  node 
in  the  network  to  another.  This  indication  of  the  bound  on  the  number 
of  independent,  concurrent  transmissions  we  will  call  the  concurrency 
of  the  network. 
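
As a rough illustration of the concurrency measure just defined (number of links divided by the longest path), the Python sketch below evaluates it for three topologies. The exact link counts and diameters are standard graph facts supplied here as assumptions; the paper itself gives only order-of-magnitude entries, and the function names are hypothetical.

```python
def concurrency(links, longest_path):
    """Rough bound on independent concurrent transmissions."""
    return links / longest_path


def ring(n):
    # Assumed bidirectional ring: n links, longest path floor(n/2).
    return n, n // 2


def hypercube(k):
    # Boolean k-cube with n = 2^k nodes: k * 2^(k-1) links, longest path k.
    n = 2 ** k
    return n * k // 2, k


def grid(side):
    # side x side four-neighbor grid without wraparound.
    return 2 * side * (side - 1), 2 * (side - 1)


for name, (links, path) in [("ring, n=1024", ring(1024)),
                            ("hypercube, n=1024", hypercube(10)),
                            ("grid, n=1024", grid(32))]:
    print(f"{name}: links={links}, longest path={path}, "
          f"concurrency={concurrency(links, path):.1f}")
```

For 1024 sites this gives a concurrency of 2 for the ring, 32 for the grid, and 512 for the hypercube, consistent with constant, square-root, and linear growth respectively.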

For some topologies, the concurrency of a network may understate
performance as actually experienced in a given application: to the
extent that there is locality of reference in transmissions, the number of
links actually traversed may be better approximated by a constant than
by some function of the number of connected sites. Network concurrency
may also overstate performance of one topology with respect to
another: to the extent that the time to traverse links is not the same for
all topologies, those that have non-uniform link costs (perhaps due to
physical distance considerations applied to the realized lengths of links)
will deliver less performance than the concurrency measure suggests.
This is because in these cases, logical adjacency due to high
dimensionality is merely apparent: embedding the topology in the dimensionality
of space available tends to incur just those expenses related to physical
distances that the topology was expected to eliminate.

2.1  Topologies  With  Scalable  Concurrency 

Several topologies which have scalable concurrency are shown in
Table 1. As the number of sites is increased, the network grows enough
to support the consequential additional traffic. In fact, by this measure
of performance, the last three of these four topologies scale performance
equally well. However, as will be described, there are other
considerations to weigh.


Table 1: Scalable Concurrency Topologies, [n = # processors]

Topology                    Number of Ports   Longest Path   Concurrency
Completely connected        O(n^2)            O(1)           O(n^2)
Crossbar                    O(n^2) (a)        O(1)           O(n)
Banyan                      O(n log n)        O(log n)       O(n)
Boolean k-cube (n = 2^k)    O(n log n)        O(log n)       O(n)

(a) The number of links is O(n).


In the crossbar and completely connected topologies, the number
of ports, a first approximation to cost, grows quadratically with the
number of nodes in the network. Weighing cost and concurrency, then,
we might prefer the banyan and boolean k-cube (also known as
"hypercube") topologies.

By  these  measures,  there  does  not  seem  to  be  a  clear-cut  choice 
between the banyan and the hypercube. A more sophisticated measure
of cost would take into account the area required for laying out
the  topology  in  a  plane  [11].  The  banyan  may  have  a  slight  edge  in 
this  category1,  but  both  layouts  require  relatively  long  wires,  which  is 
undesirable  if  link  transit  time  dominates  switching  time.2 

A  major  difference  between  the  two  topologies  is  that  switching  and 
routing  are  centralized  at  the  processor  in  the  hypercube,  whereas  the 
switching  in  the  banyan  is  distributed  throughout  the  network.  To  the 
extent  that  storage  is  required  at  the  switch  (as  in  [3]),  it  becomes  more 
economical  to  centralize  the  switch  and  utilize  the  local  storage  of  the 
processor.  For  this  reason,  we  prefer  the  hypercube. 

2.2  Topologies  With  Scalable  Cost 

There  are  alternative  topologies  not  as  richly  connected  as  those  just 
considered.  The  topologies  in  Table  2  all  have  fixed  degree  connectivity, 
so  they  all  have  scalable  cost  as  measured  by  port  count.  Unfortunately, 
none  of  them  has  scalable  concurrency.  So,  at  least  among  the  ten 
representative  topologies  discussed,  there  is  no  topology  that  has  cost- 
performance  characteristics  intrinsically  superior  to  all  the  others. 

Concurrency for the ring and the bus topologies does not increase
at all as the number of processors increases. Given no guarantee of
transmission  source  to  target  locality,  these  seem  unsuitable  for  systems 
with  a  large  number  of  processors  (e.g.,  >  100). 

The perfect shuffle and cube-connected cycles (CCC) topologies
emulate the O(log n) latency of the hypercube, but the number of links
is linear with the number of processors, so concurrency does not scale.
Also, if we measure cost in terms of layout area, the costs of the perfect
shuffle (O(n^2/log^2 n)) and CCC (O(n^2/log^2 n)) [15] do not scale,
and so these topologies will not be considered further.

The tree, grid, and torus topologies all have fixed degree
connectivity and have the optimum O(n) area requirement. The tree has a
slightly better capacity measure and a lower latency bound. Note,
however, that the tree provides no alternate communication paths (useful
in network balancing and defect tolerance) and has a bottlenecking
root.3 Connections might be added to provide alternate paths, but, as
we will see in the next section, physical link considerations may make
the grid or torus a better choice.


Table 2: Scalable Cost Topologies, [n = # processors]

Topology                Number of Ports   Longest Path   Concurrency   Area
Ring                    O(n)              O(n)           O(1)          O(n)
Global bus              O(n)              O(1)           O(1)          O(n)
Perfect shuffle         O(n)              O(log n)       O(n/log n)    O(n^2/log^2 n)
Cube-connected cycles   O(n)              O(log n)       O(n/log n)    O(n^2/log^2 n)
Binary tree             O(n)              O(log n)       O(n/log n)    O(n)
Grid/Torus              O(n)              O(sqrt n)      O(sqrt n)     O(n)

1 The area required to lay out a hypercube in a plane is O(n^2) [2], where n is the
number of processors. Since "banyan" actually denotes a class of interconnections it
is difficult to make a general statement about its layout. However, let us consider a
particular banyan network, the omega network [10], which is log n stages of perfect
shuffle connections. The perfect shuffle has area O(n^2/log^2 n) [15], so we would
expect log n perfect shuffles to require area O(n^2/log n), which is a slightly better
bound than for the hypercube. Other types of banyans, with different fan-in, fan-out,
and connectivity characteristics might have even smaller bounds.

2 See Section 3.

3 We might be able to deal with this by increasing the bandwidth of the links as
we proceed toward the root, for example with "fat trees" [12].


3  Link  Costs — Examining  The  Free  Lunch 

Most studies of topologies assume a constant cost for link traversals
as the number of links increases. This is a useful approximation if the
time to drive and receive link signals is constant with link length and
large compared to signal transit time on the link. However, this is
increasingly not a good assumption, both as the underlying feature size
of the component technology decreases and as we consider larger
numbers of sites in a system. Given a fixed circuit feature size, topologies
with scalable concurrency, as discussed in Section 2.1, suffer increased
link lengths and thus longer signal transit times (with possibly
increasing drive times) as the number of processors increases. Alternatively,
given a fixed volume of circuits in these topologies and decreasing
circuit feature size, the number of processors in the system increases but
so does the ratio between link lengths and feature size. Thus, relative
to the circuit delay times, which are dependent on (and decrease with)
circuit feature size, the link transit times become an increasingly
important consideration.4

Topology has to be viewed as a dependent variable determined
principally by the packaging technology of the system. As an example,
consider the recursive-H layout for the binary tree (Figure 1) under
the assumption that link transit time dominates switching time. Now
consider the grid in Figure 2, which can be laid out in the same area.
If transit times dominate, then shorter links and more switching sites
will likely shorten the point-to-point communications cycle time and
improve the realized capacity of the network.5 Furthermore, additional
data paths allow dynamic routing of messages, and additional
computing resources make the grid potentially more powerful than the tree.

Though  the  torus  appears  to  suffer  from  extremely  long  wires  which 
“wrap  around”  the  edges,  a  simple  renumbering  of  the  processors  in 
a  grid  brings  each  one  within  two  hops  of  its  logical  neighbors6  (see 
Figure  3).  Thus,  we  can  effectively  create  a  torus  by  changing  the 
routing  algorithm  of  a  grid.  Alternatively,  we  could  keep  the  original 
torus connections and lay out the processors as in Figure 3(b), resulting
in  links  which  are  at  most  twice  as  long  as  those  for  a  grid.  In  the 
remainder  of  the  paper,  we  will  speak  of  the  grid  bearing  in  mind 
construction  of  the  torus  in  these  terms. 
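
One renumbering with the property described above can be sketched in Python for a single row (or column) of the torus. This is an illustration under stated assumptions, not necessarily the exact scheme used in the paper: `folded_order` and `max_hop` are hypothetical names, and the interleaving of low and high indices is the assumed layout. The sketch places logical ring positions along a line so that logically adjacent processors, including the wraparound pair, are never more than two physical positions apart.

```python
def folded_order(n):
    """Physical left-to-right order of logical ring positions 0..n-1.

    Interleaves indices from the low and high ends of the ring
    (assumed layout for this sketch).
    """
    order = []
    lo, hi = 0, n - 1
    while lo <= hi:
        order.append(lo)
        if lo != hi:          # avoid duplicating the middle element
            order.append(hi)
        lo += 1
        hi -= 1
    return order


def max_hop(n):
    """Largest physical distance between logically adjacent ring nodes."""
    position = {node: i for i, node in enumerate(folded_order(n))}
    return max(abs(position[i] - position[(i + 1) % n]) for i in range(n))


print(folded_order(8))
print(max_hop(8))
```

Applying the same ordering independently to rows and columns gives a two-dimensional layout in which no torus link spans more than two grid positions.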


4  A  Packaging  Example 

We are now faced with two topologies: one with scalable performance
(the hypercube) and one with scalable cost (the grid). The arguments
presented above suggest that, all else being equal, the communication
cycle time for the hypercube would be greater than that of the grid,
due to its long links. Even so, the average message latency of the
hypercube may still be smaller, due to its high connectivity. To get a
better understanding of the relative performance of the two systems, we
should examine how they might actually be implemented in near-future
technology.

In the mid-1990s we would expect a 0.5-µm MOS fabrication
process to be available [7]. We will assume that the complexity of our
processor is comparable to today's typical 32-bit microprocessor. The

4 The dependence of communication delays on signalling lengths as circuit feature
size decreases depends on assumptions made on the thickness and thus the resistivity
of associated interconnects. Uniform scaling leads to relative signalling times that
increase quadratically with distance [19]. Detailed analyses of the equations of
voltage and current in VLSI wire implementations (including consideration of the
non-linear characteristics of signal drivers) demonstrated linear dependences [1] but
were done assuming that the interconnect (and field oxide) thicknesses did not
decrease at all while all other dimensions scaled with the circuit feature size of the
technology [17]. Another approach imagines a hierarchy of interconnect of increasing
thicknesses with distance [13] to achieve signalling times that grow only with the
logarithm of the distance. Yet another approach accepts resistive links but, given
control over both minimum and maximum wire lengths and use of high impedance
receivers, notes that it is possible to counter dispersive losses with reflective voltage
doubling at the receiving end of a point to point link [9].

5 The assumption made here is that the message routing is relatively independent
of the computing activities at a processing site, so there is no penalty associated
with being routed at a processing site rather than a switch.

6 This approach is attributed to H. Zippel.


Figure 1: Recursive-H binary tree


Figure  2:  Two-dimensional  grid 


Figure 3: Torus (a) and renumbered grid (b).


MicroVAX 78032 chip [4], for example, is implemented in 3-µm
technology; it measures about 8.5 mm on a side. Using 0.5-µm technology,
we could expect a similar processor to require around 1.5 mm on a side.
Let us allow 256K bytes (2M bits) of local memory for our processor.
Fujitsu's megabit RAM using 1.4-µm technology takes 54.7 mm^2 [6]. If
the dimensions of the Fujitsu chip are about 10 mm by 5.5 mm, then
a 0.5-µm version would be 3.6 mm by 2.0 mm. Two of these (since we
want 2M bits) would be around 3.6 mm by 4 mm. As an
approximation, then, each processing element, including a processor, 256K bytes
of local memory, and switching and routing circuitry, could be expected
to fit onto a 5 mm x 5 mm piece of silicon.

Even as devices shrink, die sizes continue to grow. By the mid-90s,
the state-of-the-art chips may be as large as 15 mm on a side. Each
chip would be expected to have 400-600 I/O pads [14]. Therefore, we
could put up to nine processing sites on a single die.

The dice could be flip-mounted on a silicon [5] or ceramic [9] substrate
with thin film transmission lines and integrated capacitors. In
[9], the maximum length for 5-µm-thick lines is around 20 cm, so we
will assume a 10x10 cm module size, on which we can easily place up
to 36 dice. We will assume on the order of 1000 I/O pins per module
[5].
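
The sizing argument above is proportional arithmetic, and a short sketch may make it easier to vary the assumptions. The Python fragment below assumes that linear dimensions shrink in proportion to feature size (read here as 3 µm, 1.4 µm and 0.5 µm) and that square tiles are packed with no spacing between them. The helper names `scaled` and `tiles_per_side` are illustrative and do not come from the paper.

```python
def scaled(length_mm, old_feature_um, new_feature_um):
    """Scale a linear dimension in proportion to the feature size."""
    return length_mm * new_feature_um / old_feature_um


def tiles_per_side(outer_mm, inner_mm):
    """Whole number of inner squares that fit along one side of an outer square."""
    return outer_mm // inner_mm


# Processor: 8.5 mm on a side at 3 um, rescaled to 0.5 um.
cpu_side = scaled(8.5, 3.0, 0.5)

# Megabit RAM: about 10 mm x 5.5 mm at 1.4 um, rescaled to 0.5 um.
ram_w = scaled(10.0, 1.4, 0.5)
ram_h = scaled(5.5, 1.4, 0.5)

# Packing: 5 mm sites on a 15 mm die, 15 mm dice on a 100 mm module.
sites_per_die = tiles_per_side(15, 5) ** 2
dice_per_module = tiles_per_side(100, 15) ** 2
processors_per_module = sites_per_die * dice_per_module

print(round(cpu_side, 2), round(ram_w, 2), round(ram_h, 2))
print(sites_per_die, dice_per_module, processors_per_module)
```

Under these assumptions the sketch gives nine sites per die, 36 dice per module and 324 processors per module, matching the figures used in the text.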

Consider first packaging a (32x32) 1024-element octal grid, in which
each processor is connected to eight neighbors. With nine processors
(arranged as a 3x3 grid) on a die, 32 (bi-directional) communication
links must come off the chip through the I/O pads, so no more than 18
pads could be used per channel. A module can carry 324 processors,
arranged as an 18x18 grid. The entire system, then, could fit on four
modules (with room to spare). The communications links from two
sides of the 18x18 grid (105 bidirectional channels) must go off-module.
Thus, each channel could use 10 pins: one pin for clock and status
information and four for data, in each direction.
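
The link counts quoted for the octal grid can be checked by brute force. The sketch below counts the links that leave a square block of processors in an eight-neighbor grid. It assumes the die sits in the interior of the grid and that the 18x18 module occupies one corner of a 2x2 arrangement of such modules (a 36x36 grid, slightly larger than the 1024-element system), so that only two of its sides face other modules. The function name `boundary_links` is hypothetical.

```python
def boundary_links(k, grid_size, x0=0, y0=0):
    """Count links from a k-by-k block at (x0, y0) to processors that lie
    outside the block but inside a grid_size-by-grid_size octal grid."""
    offsets = [(dx, dy) for dx in (-1, 0, 1) for dy in (-1, 0, 1)
               if (dx, dy) != (0, 0)]
    count = 0
    for x in range(x0, x0 + k):
        for y in range(y0, y0 + k):
            for dx, dy in offsets:
                nx, ny = x + dx, y + dy
                in_grid = 0 <= nx < grid_size and 0 <= ny < grid_size
                in_block = x0 <= nx < x0 + k and y0 <= ny < y0 + k
                if in_grid and not in_block:
                    count += 1
    return count


die_links = boundary_links(3, 32, x0=3, y0=3)   # 3x3 die in the grid interior
module_links = boundary_links(18, 36)           # 18x18 module in a corner

pads_per_channel = 600 // die_links             # upper end of 400-600 pads
pins_per_channel = 1000 / module_links          # about 1000 module pins

print(die_links, module_links, pads_per_channel, round(pins_per_channel, 1))
```

Each link is counted once because it leaves the block from exactly one processor inside it. With roughly 1000 pins the budget comes to a little under ten pins per channel, which is consistent with the text's "on the order of 1000" pins and ten-pin channels.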

Now consider a 1024-element hypercube (a “10-cube”). To allow for
more complex wiring and easier packaging, we will assume that each die
contains eight processors, and each module will hold 32 dice, for a total
of 256 processors per module. (Extra space might be used to provide
redundant processors for fault tolerance.) Again, only four modules
are required to package all 1024 processors. Each processor has ten
bidirectional links to its logical neighbors. If the eight processors on a
die are wired as a 3-cube, then seven channels from each processor must
go off-chip. Five of these channels are connected to other processors
on the same module, but two must go off the module. With only
1000 I/O pins for 512 bidirectional channels, it appears that a 1-bit
combined control/data stream is all that can be supported for the
hypercube communications. If we decrease the number of processors
per die to four (and possibly add more memory), we can use separate
wires for control and data but the wires will be longer.
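
The hypercube counts follow from the fact that a block wired as an m-cube inside a d-cube keeps m links per processor internal and sends the remaining d - m links outside. A minimal sketch, with the hypothetical helper `off_block_channels`, reproduces the numbers above and the comparison with the 105 off-module channels of the grid:

```python
def off_block_channels(cube_dim, block_dim):
    """Channels leaving a block_dim-subcube embedded in a cube_dim-hypercube."""
    processors = 2 ** block_dim
    return processors * (cube_dim - block_dim)


die_channels = off_block_channels(10, 3)      # 8 processors, 7 links each
module_channels = off_block_channels(10, 8)   # 256 processors, 2 links each

wires_per_channel = 1000 / module_channels    # about 1000 module pins
ratio_to_grid = module_channels / 105         # grid needs 105 channels

print(die_channels, module_channels)
print(round(wires_per_channel, 2), round(ratio_to_grid, 2))
```

Fewer than two wires are available per bidirectional channel, which is why only a 1-bit combined control/data stream fits.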

Note  that  in  both  cases  the  module  pin-out  is  the  limiting  factor 
for  channel  width,  rather  than  the  chip  pin-out.  If  more  off-module 
I/O pins are available, things will look better, but there will still be
around a 5-to-1 ratio of the number of required off-module channels
in  the  hypercube  as  compared  to  the  grid.  As  mentioned  before,  the 
average  interconnect  length  for  the  grid  will  be  much  shorter  than  that 
for  the  hypercube.  Therefore,  the  grid  offers  shorter  (i.e.,  faster)  and 
wider  communication  paths  than  the  hypercube  when  implemented  in 
projected  near-future  technology. 

5  Beyond  Topology 

As the previous example indicates, the electrical and physical characteristics
of the circuit packaging in a system may dictate the scheme used
to wire the nodes together. In addition, the communications protocol,
that is, the actual signalling on the links, is an important component of
achievable performance. There are many relevant details, for example:

•  Dynamic routing, selecting available links as needed, is useful in
balancing load and thus allows more of the communication resources
of the system to be well used throughout a computation.

•  Cut-through  routing,  making  a  routing  decision  on  the  fly  as  a 
packet  is  received,  reduces  buffer  requirements  in  the  system  and 
minimizes  latency  experienced  in  network  transit. 

•  Local flow control, signalling transmission delays back to the
source based on local blockage information, together with single
“word” buffering and transmission validation at each network
input and output port, allows the source to complete a validated
transmission in a time that does not depend on the size of the
network.

•  Point to point multicast, sending (approximately) the same
packet to multiple targets using common resources to the largest
degree possible, when coupled with dynamic, cut-through routing, flow
control, and word level buffering and transmission validation,
provides “virtual busses” precisely as and when they are needed.

A  point-to-point  protocol  utilizing  these  mechanisms  is  described  in  [3]. 
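
To illustrate why cut-through routing reduces transit latency, the sketch below compares it with store-and-forward routing under a deliberately idealized model that is not taken from the paper: no contention, one routing decision per hop made as soon as the header words arrive, and a fixed time per word on every link. All names and parameters are assumptions made for the example.

```python
def store_and_forward_time(hops, packet_words, word_time=1.0):
    """Each node receives the whole packet before forwarding it."""
    return hops * packet_words * word_time


def cut_through_time(hops, packet_words, header_words=1, word_time=1.0):
    """The header advances hop by hop; the rest of the packet streams behind it."""
    return (hops * header_words + packet_words - header_words) * word_time


for hops in (1, 10, 30):
    print(hops,
          store_and_forward_time(hops, 16),
          cut_through_time(hops, 16))
```

In this model the two schemes coincide for a single hop, and the cut-through time grows with the header length per hop rather than with the full packet length.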

6  Conclusion 

Communications performance in practical systems depends first of all
on available packaging technology and second on protocol considerations.
No topology considered here has both scalable cost and performance,
so the topology must be chosen in the context of the number
of processors targeted. For a thousand processors or so, given the
assumptions on mid-1990s technology discussed earlier, the grid (or
torus) seems an appropriate choice. The performance of the grid will
depend on the signalling protocol and will be best predicted through
application simulations detailed enough to reflect design decisions made
at that level.

ABSTRACT

New reasoning techniques for dealing with uncertainty in Expert
Systems have been embedded in RUM, a Reasoning with Uncertainty
Module. RUM is an integrated software tool based on a frame
system (KEE) that is implemented in an object oriented language.
RUM’s capabilities are subdivided into three layers: Representation,
Inference, and Control. The Representation layer is based
on frame-like data structures that capture the uncertainty information
used in the inference layer and the uncertainty meta-information
used in the control layer. Linguistic probabilities are
used to describe lower and upper bounds of the certainty measure
attached to a Well Formed Formula (wff). The source and the conditions
under which the information was obtained represent the
non-numerical meta-information.

The Inference layer provides the uncertainty calculi to perform the
intersection, detachment, union and pooling of the information.
Five uncertainty calculi, based on their underlying Triangular norms
(T-norms), are used in this layer.

The Control layer uses the meta-information to select the appropriate
calculus for each context and to resolve eventual ignorance or
conflict in the information. This feature enables the programmer
to declaratively express the local (context dependent) meta-knowledge
that will substitute for the global assumptions traditionally
used in uncertain reasoning. The control layer also provides
a context mechanism that allows the system to focus on the relevant
portion of the knowledge base, and an uncertain-belief revision
system that incrementally updates the certainty values of well formed
formulae (wffs) in an acyclic directed deduction graph.


*  This is a modified version of the paper RUM: A Layered Approach to Reasoning
with Uncertainty that will appear in the Proceedings of the Tenth International
Joint Conference on Artificial Intelligence (IJCAI 87), Milano, Italy,
August 1987.

The development of RUM’s underlying theory was supported by the Defense
Advanced Research Projects Agency (DARPA) under USAF Rome Air
Development Center contract F30602-85-C-0033. Views and conclusions contained
in this paper are those of the authors and should not be interpreted
as representing the official opinion or policy of DARPA or the U.S. Government.

The implementation of RUM was not part of (nor was it funded by) DARPA
contract F30602-85-C-0033. This description is included only for the purpose
of illustrating the technology integration and transition efforts that GE, within
the spirit of the Strategic Computing Initiative, has undertaken outside of its
contractual obligations.


1.  Introduction: Reasoning with Uncertainty in Expert Systems

The trend followed by most approaches for reasoning with
uncertainty has shown an almost complete disregard for the
fundamental issues of automated reasoning, such as the
proper representation of information and meta-information,
the allowable inference paradigms suitable for the representation,
and the efficient control of such inferences in an explicitly
programmable form. The majority of the approaches to
reasoning with uncertainty do not properly cover these issues.
Some approaches lack expressiveness in their representation
paradigm. Other approaches require unrealistic assumptions
to provide uniform combining rules defining the
plausible inferences. Most approaches do not even recognize
the need for having an explicit control of the inferences.

Specifically, the non-numerical approaches [Cohen
1983a, 83b; Doyle 1983] are inadequate to represent and
summarize measures of uncertainty. The numerical approaches
generally tend to impose some restrictions upon the
type and structure of the information (e.g., mutual exclusiveness
of hypotheses, conditional independence of evidence).
Most numerical approaches represent uncertainty as a precise
quantity (scalar or interval) on a given scale. They require
the user or expert to provide a precise yet consistent numerical
assessment of the uncertainty of the atomic data and of their
relations. The output produced by these systems is the result
of laborious computations, guided by well-defined calculi,
and appears to be equally precise. However, given the
difficulty in consistently eliciting such numerical values from
the user, it is clear that these models of uncertainty require
an unrealistic level of precision that does not actually
represent a real assessment of the uncertainty.

With few exceptions, such as MRS [Genesereth 1982], the
control of the inference process in most expert systems has
been procedurally embedded in the inference engine, thus
preventing any opportunistic and dynamic change in ordering
inferences and in aggregating uncertainty. Usually, the same
set of aggregation operators (i.e., the same uncertainty calculus)
is selected a priori and is used uniformly for any inference
made by the expert system. In the few numerical approaches
where conflicting information is detected [Shafer
1976], conflict handling is done in the inference layer, where
the conflict resolution procedure is embedded in the same
combining rules. This procedure consists of removing the
conflicting part of the information. The non-conflicting portion
is then normalized and propagated as if the conflict never
existed.


This lack of awareness for the fundamental issues of automated
reasoning has been the driving force for compiling a
list of requirements (desiderata) that each reasoning system
handling uncertain information should satisfy. Following the
typical structure of automated reasoning techniques, the list
of requirements has been organized in three layers: representation,
inference, and control. The extension of this explicit
layered separation from crisp-reasoning systems to uncertain-reasoning
systems is a natural step leading to a better integration
of the management of uncertainty with the various techniques
for automated reasoning.

An in-depth treatment of the layered desiderata can be found
in an earlier paper [Bonissone 1986]. In this article we describe
the theory, design, and implementation of RUM, a
Reasoning with Uncertainty Module whose layered architecture
reflects the requirements described in the desiderata. In
the next two sections we will summarize RUM’s underlying
theory and design, with a particular focus on its control layer.
In the last section we will discuss the conclusions of this
work.


For compactness of notation, we will denote the meaning of
the term set element $L_i$ as the list $(a_i\ b_i\ \alpha_i\ \beta_i)$. RUM provides
the user with four different term sets that can be used
to define the granularity desired in the subjective assessment
of probability. The four term sets contain five, seven, nine,
and thirteen elements, respectively. The following table illustrates
one of the term sets, the nine element L-nine.

Index  Symbol              Meaning

1      impossible          (0 0 0 0)
2      extremely_unlikely  (.01 .02 .01 .05)
3      very_low_chance     (.1 .18 .06 .05)
4      small_chance        (.22 .36 .05 .06)
5      it_may              (.41 .58 .09 .07)
6      meaningful_chance   (.63 .80 .05 .06)
7      most_likely         (.78 .92 .06 .05)
8      extremely_likely    (.98 .99 .05 .01)
9      certain             (1 1 0 0)

TABLE 1: The Nine Element Term Set L-nine


2.  RUM’s  Underlying  Theory 

Preliminary theoretical results were presented in two previous
publications [Bonissone 1985; 1986]. This section summarizes
some of those results and provides a unified framework
for their interpretation and use in RUM’s architecture. A
philosophical motivation for RUM’s three layer organization
can also be found in [Bonissone 1987a].

2.1  Term  Sets  of  Linguistic  Probabilities 


In expert system applications, users and experts must frequently provide
subjective assessments of probability. Due to the difficulty of
eliciting precise and consistent numerical certainty values, we have
suggested the use of term sets of linguistic probability. Each term
set determines the finest level of specificity (i.e., the granularity) of
the measure of certainty that the user/expert can consistently provide.

A term set of linguistic probabilities is the set of symbols
$L = \{L_1, L_2, \ldots, L_n\}$. The meaning of each term $L_i \in L$ is
represented by a fuzzy number on the [0,1] interval. The
fuzzy number’s membership distribution, $\mu_{L_i}(x)$, is defined
as the mapping:

$$\mu_{L_i}(x): [0,1] \rightarrow [0,1] \quad \text{for all } x \in [0,1].$$


A computationally more efficient way to characterize a fuzzy
number is to use a parametric representation of its membership
function. This parametric representation [Bonissone 80;
85] is achieved by the 4-tuple $(a_i, b_i, \alpha_i, \beta_i)$. The first two
parameters indicate the interval in which the membership
value is 1.0; the third and fourth parameters indicate the left
and right width of the distribution. Linear functions are used
to define the slopes. Therefore, the membership function
$\mu_{L_i}(x)$ is defined as:

$$
\mu_{L_i}(x) =
\begin{cases}
0 & \text{if } x < (a_i - \alpha_i) \\
\left(\frac{1}{\alpha_i}\right)(x - a_i + \alpha_i) & \text{if } x \in [(a_i - \alpha_i),\, a_i] \\
1 & \text{if } x \in [a_i,\, b_i] \\
\left(\frac{1}{\beta_i}\right)(b_i + \beta_i - x) & \text{if } x \in [b_i,\, (b_i + \beta_i)] \\
0 & \text{if } x > (b_i + \beta_i)
\end{cases}
$$

RUM’s  representation  layer  allows  the  user  to  characterize 
the  lower  and  upper  bounds  of  the  certainty  of  a  given  fact 
by  using  elements  of  a  selected  term  set. 
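
As an illustration of the parametric representation, the following sketch evaluates the trapezoidal membership function of a term given as a 4-tuple (a, b, alpha, beta). The values are the L-nine entries of Table 1 as read from the source (the second parameter of extremely_likely is taken to be .99). The names `L_NINE` and `membership` are hypothetical and are not part of RUM.

```python
# Term set L-nine: symbol -> (a, b, alpha, beta), as listed in Table 1.
L_NINE = {
    "impossible":         (0.0, 0.0, 0.0, 0.0),
    "extremely_unlikely": (0.01, 0.02, 0.01, 0.05),
    "very_low_chance":    (0.1, 0.18, 0.06, 0.05),
    "small_chance":       (0.22, 0.36, 0.05, 0.06),
    "it_may":             (0.41, 0.58, 0.09, 0.07),
    "meaningful_chance":  (0.63, 0.80, 0.05, 0.06),
    "most_likely":        (0.78, 0.92, 0.06, 0.05),
    "extremely_likely":   (0.98, 0.99, 0.05, 0.01),
    "certain":            (1.0, 1.0, 0.0, 0.0),
}


def membership(x, term):
    """Membership value of x in the fuzzy number (a, b, alpha, beta)."""
    a, b, alpha, beta = term
    if a <= x <= b:                      # plateau: full membership
        return 1.0
    if a - alpha <= x < a:               # left slope (only reachable if alpha > 0)
        return (x - a + alpha) / alpha
    if b < x <= b + beta:                # right slope (only reachable if beta > 0)
        return (b + beta - x) / beta
    return 0.0                           # outside the support


print(membership(0.5, L_NINE["it_may"]))
print(membership(0.365, L_NINE["it_may"]))
print(membership(0.2, L_NINE["impossible"]))
```

Testing the plateau first means that crisp terms such as impossible and certain, whose widths are zero, never reach a division by zero.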

2.2  T-norms:  Definitions  and  Equivalence  Classes 

Triangular norms (T-norms) and Triangular conorms (T-conorms)
are the most general families of binary functions
that satisfy the requirements of the conjunction and disjunction
operators, respectively [Bonissone 1985]. A T-norm is
defined as a mapping $T: [0,1]^2 \rightarrow [0,1]$ which is monotonic,
commutative and associative. The boundary conditions of a
T-norm (i.e., the evaluation of any T-norm at the extremes
of the [0,1]x[0,1] unit square) satisfy the truth tables of the
logical AND operator. The T-conorms are defined in terms
of the T-norms and a negation operator, by using a generalization
of DeMorgan’s duality. Thus, for a suitable negation
operator, such as N(a) = 1-a, the T-conorm S(a,b) is defined
as:

S(a,b) = N(T(N(a), N(b)))

In a previous paper [Bonissone 1985], six parametrized families
of T-norms and dual T-conorms were discussed and analyzed.
Of the six parametrized families, one family was
selected due to its complete coverage of the T-norm space
and its numerical stability. This family, originally defined by
Schweizer & Sklar [Schweizer 1963], is denoted by
$T_{Sc}(a,b,p)$, where p is the parameter that spans the space
of T-norms. More specifically:

$$T_{Sc}(a,b,p) = \left[\max\left(0,\; a^{-p} + b^{-p} - 1\right)\right]^{-1/p}$$

Its corresponding T-conorm, denoted by $S_{Sc}(a,b,p)$, is defined as:

$$S_{Sc}(a,b,p) = 1 - T_{Sc}(1-a,\; 1-b,\; p)$$

We have seen that the use of term sets determines the granularity
with which the input certainty is described. This granularity limits
the ability to differentiate between two similar calculi; the numerical
results obtained by using two calculi whose underlying T-norms are
very close in the T-norm space will fall within the same granule in
a given term set. Therefore, only a finite, small subset of the infinite
number of calculi that can be generated from the parametrized T-norm
family produces notably different results. The number of calculi
to be considered is a function of the uncertainty granularity.

This result has been confirmed by an experiment [Bonissone 1985]
where eleven different calculi of uncertainty, represented by their corresponding
T-norms, were analyzed. To generate the eleven T-norms,
the parameter p in Schweizer’s family was given the following values:
-1, -0.8, -0.5, -0.3, 0 (in the limit), 0.5, 1, 2, 5, 8, and ∞ (in the limit).

The experiment showed that five equivalence classes were needed to
represent (or reasonably approximate) any T-norm. The corresponding
five uncertainty calculi were defined by the common negation
operator N(a) = 1-a and the DeMorgan pair ($T_{Sc}(a,b,p)$, $S_{Sc}(a,b,p)$) for
the following values of p:

p = -1:    $T_1(a,b) = \max(0,\; a+b-1)$
           $S_1(a,b) = \min(1,\; a+b)$

p = -0.5:  $T_{Sc}(a,b,-0.5) = \max(0,\; a^{0.5}+b^{0.5}-1)^{2}$
           $S_{Sc}(a,b,-0.5) = 1 - \max(0,\; (1-a)^{0.5}+(1-b)^{0.5}-1)^{2}$

p → 0:     $T_2(a,b) = ab$
           $S_2(a,b) = a + b - ab$

p = 1:     $T_{Sc}(a,b,1) = \max(0,\; a^{-1}+b^{-1}-1)^{-1}$
           $S_{Sc}(a,b,1) = 1 - \max(0,\; (1-a)^{-1}+(1-b)^{-1}-1)^{-1}$

p → ∞:     $T_3(a,b) = \min(a,b)$
           $S_3(a,b) = \max(a,b)$

RUM’s inference layer provides the user with a selection of the five
T-norm based calculi described above. In the inference layer, they are
referred to as $T_1$, $T_{1.5}$, $T_2$, $T_{2.5}$, $T_3$, respectively.
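
The five calculi can be generated from a single parametrized function. The sketch below assumes the Schweizer & Sklar family takes the form T(a,b,p) = max(0, a^-p + b^-p - 1)^(-1/p), with the product and the minimum as the limits for p → 0 and p → ∞; this form reproduces each of the five instances listed above. The function names are illustrative and are not RUM’s own.

```python
import math


def t_norm(a, b, p):
    """Schweizer & Sklar T-norm for a, b in [0, 1] and parameter p."""
    if p == 0:                         # limit p -> 0: product
        return a * b
    if math.isinf(p):                  # limit p -> +infinity: minimum
        return min(a, b)
    if p > 0 and (a == 0 or b == 0):   # a ** -p would divide by zero; the value is 0
        return 0.0
    base = max(0.0, a ** -p + b ** -p - 1.0)
    return base ** (-1.0 / p)


def t_conorm(a, b, p):
    """Dual T-conorm obtained through DeMorgan's law with N(a) = 1 - a."""
    return 1.0 - t_norm(1.0 - a, 1.0 - b, p)


# The five calculi T1, T1.5, T2, T2.5, T3 correspond to these values of p.
FIVE_CALCULI = [-1, -0.5, 0, 1, math.inf]

for p in FIVE_CALCULI:
    print(p, round(t_norm(0.7, 0.6, p), 4), round(t_conorm(0.7, 0.6, p), 4))
```

For any fixed pair of arguments the T-norm values increase from $T_1$ to $T_3$, so the five calculi range from the most conservative conjunction to the most permissive one.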

3.  Design  of  RUM’s  Layered  Architecture 

RUM’s architecture is based on three layers: representation, inference,
and control. In the first layer (the representation layer) we describe
the structure required to capture information used in the inference
layer and meta-information used in the control layer. In this structure,
linguistic probabilities are used to describe the lower and upper
bounds of the certainty measure associated with the Well Formed
Formula (wff). Various term sets of linguistic probabilities (with fuzzy-valued
semantics) provide different granularities of the certainty measure.
Non-numerical meta-information, describing the source and the
conditions under which the information was obtained, is also
represented in this layer.

In the second layer (the inference layer) we define five uncertainty
calculi based on their underlying Triangular norms (T-norms). Any
operation required by an uncertainty calculus can be expressed in
terms of its T-norm and a negation operator. From past experience,
it was noted that T-norm based calculi have various computational
advantages: they are truth-functional, commutative, and associative.
Therefore, if numerical computations to evaluate T-norm based expressions
are carried out at run-time, the above properties ensure that
any result can be directly computed from the individual value of each
argument; that the result is independent from the order of the arguments;
and that for more than two arguments, the evaluation of T-norm
expressions can be done recursively (alternatively, the evaluation
can be decomposed by subdividing the arguments into subgroups,
performing each local evaluation independently, and aggregating
the partial results).
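
The recursive and decomposed evaluations described above can be sketched with a left fold. The example uses three of the T-norms defined in section 2.2 and relies on the boundary condition T(1, x) = x, which makes 1 a neutral starting value. The helper names `aggregate` and `check_decomposition` are hypothetical.

```python
from functools import reduce


def t1(a, b):
    return max(0.0, a + b - 1.0)


def t2(a, b):
    return a * b


def t3(a, b):
    return min(a, b)


def aggregate(t_norm, values):
    """Evaluate a T-norm over any number of arguments, two at a time."""
    return reduce(t_norm, values, 1.0)


def check_decomposition(t_norm, values):
    """Return the result computed whole, by subgroups, and in reverse order."""
    half = len(values) // 2
    whole = aggregate(t_norm, values)
    partials = [aggregate(t_norm, values[:half]), aggregate(t_norm, values[half:])]
    by_groups = aggregate(t_norm, partials)
    reordered = aggregate(t_norm, list(reversed(values)))
    return whole, by_groups, reordered


certainties = [0.9, 0.8, 0.95, 0.7]
for name, t in (("T1", t1), ("T2", t2), ("T3", t3)):
    print(name, check_decomposition(t, certainties))
```

The three results agree for each T-norm up to floating-point rounding, which is what commutativity and associativity guarantee.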

In the third layer (the control layer) we define the functions
required to select the calculus appropriate for each context
and to resolve eventual ignorance or conflict in the information.
These functions rely on local (i.e., context-dependent)
knowledge about the information (meta-knowledge). The
scope of the calculus selection and ignorance/conflict resolution
is limited to the context (knowledge base subset) for
which the meta-knowledge is available. Figure 1 illustrates
RUM’s architecture. The following sections describe RUM’s
functions attached to each of the three layers.


3.1  Representation:  the  Wff  System  and  the  Rule  Language 

The representation layer is based on frame-like data structures
that capture the uncertainty information used in the
inference layer and the uncertainty meta-information used in
the control layer.

3.1.1  RUM’s  Wff  System 

RUM’s Wff System modifies KEE’s representation of a wff
(well-formed formula). RUM’s wff is the pair [<unit>
<slot>], which is the description of a variable in the problem
domain. For each wff a corresponding uncertainty unit is
created. The unit contains a list of the values that were considered
for the wff. For each value the unit maintains its
certainty’s lower and upper bounds, an ignorance measure, a
consistency measure, and the evidence source.
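
A minimal sketch of such an uncertainty unit, written as plain Python data classes rather than KEE frames, is given below. The class and field names are assumptions made for the illustration, the bounds are stored as 4-tuples in the parametric form of section 2.1, and the example certainty values are invented. How RUM computes the ignorance and consistency measures is not described here, so they are left as optional fields.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

FuzzyNumber = Tuple[float, float, float, float]   # (a, b, alpha, beta)


@dataclass
class ValueRecord:
    """Certainty information kept for one candidate value of a wff."""
    lower: FuzzyNumber
    upper: FuzzyNumber
    ignorance: Optional[float] = None      # not computed in this sketch
    consistency: Optional[float] = None    # not computed in this sketch
    sources: List[str] = field(default_factory=list)


@dataclass
class UncertaintyUnit:
    """Uncertainty unit attached to the wff [unit slot]."""
    unit: str
    slot: str
    values: Dict[str, ValueRecord] = field(default_factory=dict)

    def record(self, value, lower, upper, source):
        """Store the bounds for a value and remember which evidence produced them."""
        entry = self.values.setdefault(value, ValueRecord(lower, upper))
        entry.lower, entry.upper = lower, upper
        entry.sources.append(source)
        return entry


# Illustrative use with invented certainty bounds.
uu = UncertaintyUnit("Platform-439", "Class-name")
uu.record("Submarine", (0.0, 0.0, 0.0, 0.0), (0.41, 0.58, 0.09, 0.07), "Rule-550")
print(uu.values["Submarine"])
```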

Figure 2 illustrates an example of an uncertainty unit attached
to a wff. The wff is the variable [Platform-439 Class-name].
In the uncertainty unit, under the slot VALUES, we
can see the possible values which were considered by the system
and their corresponding certainty bounds. The uncertainty
unit also maintains a record of the rule instances which
were fired to derive such values (for inferred wffs, this logical
support represents the evidence source).

RUM’s Wff System allows the user to express arbitrary uncertainty
granularity by providing the flexibility to mix precise
and imprecise measures of certainty in defining the input
certainty (points, intervals, fuzzy numbers/intervals, linguistic
values) and the rule strengths (categorical and plausible
IF/IFF). Various term sets of linguistic probabilities with


[Figure 2: KEE display of the uncertainty unit attached to the wff. The
recoverable content shows own slots holding the constraint rules and
dependent rules, the rule sets to be combined with each T-conorm (S1,
S1.5, S2, S2.5, S3, and the Dempster-Shafer conorm), a flag, the
NECESSITY (minimum support for a wff) and PLAUSIBILITY (maximum
support for a wff) bounds, and a VALUE slot listing the candidate
values Submarine, Merchant, and Fishing-Boat with their lower and
upper certainty bounds given as 4-tuples.]


Figure 2. Uncertainty Unit Associated with wff [Platform-439 Class-name]


fuzzy-valued semantics [Beyth-Marom 1982, Bonissone
1985] provide a selection of input granularity. The values of
the terms can be used as default values or can be modified
by the user.

3.1.2  RUM’s Rule System: The Rule Language

RUM’s Rule System replaces KEE Rule System-3 capabilities
by incorporating uncertainty information in the inference
scheme. The uncertain information is described in the uncertainty
units of the wffs, represented in RUM’s Wff System,
and in the degrees of sufficiency and necessity attached
to each rule.* The degree of sufficiency denotes the extent to
which one should believe in the rule conclusion, if the rule
premise is satisfied. The degree of necessity indicates the
confidence with which one can negate the conclusion, if the
premise fails.

A rule is internally represented by a frame with several slots.
These slots include the name of the rule; the lists of contexts,
premises, and conclusions; the rule’s sufficiency and
necessity; and the T-norm to be used for aggregation. All

*  It is important to note that the inference symbol → in the production
rule A →(s) B is interpreted as a (weak) material implication operator in
multiple-valued logics. The value s is the lower bound of the degree of
sufficiency of the implication. This is in contrast with the interpretation
of conditioning, i.e., s = P(B|A). The symbol ↔ in the production
rule A ↔(s,n) B is interpreted as a (weak) logical equivalence
operator in multiple-valued logics, in which s and n are the lower
bounds of sufficiency and necessity, respectively. This (weak) logical
equivalence is an if-and-only-if (IFF) rule, which can be decomposed
into the two rules A →(s) B and B →(n) A (equivalent to ¬A →(n) ¬B).
RUM’s rules are of the type C ⇒ (A → B), where C indicates
the context of the rule (see section 2.3.3) and ⇒ represents the strong
material implication.


slots (except the name, premises, and consequences) have
default values. The contexts, premises, and conclusions can
comprise values, variables, RUM predicates and arbitrary
LISP functions. Rules with unbound variables are instantiated
with the necessary environment to produce rule instances.
Figure 3 illustrates an example of the instantiation of Rule-550,
internally represented as a frame. Rule-550 defines a
relationship between the parameters obtained from a sensor
report (sonar) and the value Submarine for the wff [Platform
Class-name]. The same rule in its English and Lisp versions
(the latter being the form in which the rule is originally written)
is described in section 3.1 of the second paper included
in Part 1.

The T-norm specified with each rule is used to aggregate the certainties of the rule premises and to perform detachment (which computes the certainty of the conclusion given the sufficiency and necessity of the rule). It defaults to $T_3$, which is the MIN function. The associated T-conorm is used to aggregate the certainties of identical conclusions inferred by multiple rule instances derived from the same rule. These are often subsumptive, and the value defaults to $S_3$, the MAX function. Finally, each separate consequence of a rule has a specified T-conorm that will be used to aggregate the consequence with identical consequences derived from different rules (i.e., multiple assignments of the same value to the wff). The negation operator causes the wff to be assigned the complemented value.


If a wff has a value A and the certainty interval attached to the value A is $[L(A), U(A)]$, its complemented value, $\neg A$, has a certainty interval defined by $[1 - U(A),\ 1 - L(A)]$.
[Figure 3 is a KEE frame display of an instance of Rule-550 bound to TRACK-3. The legible own slots include CONTEXTS, PREMISES, RULE.SUFFICIENCY, RULE.NECESSITY, T.NORM (value T3), FLAG, NECESSITY, PLAUSIBILITY, PRIMITIVE.SLOTS, SUB.TASK, SUB.THRESHOLD, SUPER.TASK, SUPER.THRESHOLD, and TEMP.CONSQ.LIST, each listed with its inheritance role, cardinality, comment, and values.]

Figure  3.  Internal  Representation  of  an  Instance  of  Rule  550 


3.2 Inference: Triangular Norms (T-norms) Based Calculi

The inference layer is built on a set of five Triangular norms (T-norms) based calculi. The T-norms' associativity and truth functionality entail problem decomposition and relatively inexpensive belief revision. The theory of T-norms has been covered in previous articles [Bonissone 1985; 3 - 86].
3.2.1 Operations in a T-norm Based Calculus

For each calculus, four operations are defined in RUM's Rule System: premise evaluation, conclusion detachment, conclusion aggregation, and source consensus. Each operation in a calculus can be completely defined by a Triangular norm $T(\cdot,\cdot)$ and a negation operator $N(\cdot)$ (just as in classical logic, any boolean expression can be rewritten in terms of an intersection and complementation operator). The four operations are defined as follows:

Premise evaluation: The premise evaluation operation determines the degree to which all the clauses in the rule premise have been satisfied by the matching wffs. Let $b_i$ and $B_i$ indicate the lower and upper bounds of the certainty of condition i in the premise of a given rule. Then the premise certainty range $[b, B]$ is defined as:

$$[b, B] = [\,T(b_1, b_2, \ldots, b_m),\ T(B_1, B_2, \ldots, B_m)\,]$$
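As an illustration of premise evaluation, the following minimal sketch folds a T-norm over the clause bounds. Only the fact that $T_3$ is MIN comes from the text; the definitions of `t1` and `t2` (Lukasiewicz and product) are assumptions taken from the general T-norm literature, and the function name and clause values are hypothetical.

```python
from functools import reduce

# Only T3 = MIN is stated in the text; t1 and t2 are assumed definitions
# from the general T-norm literature.
def t1(a, b):
    return max(0.0, a + b - 1.0)   # Lukasiewicz (bounded difference)

def t2(a, b):
    return a * b                   # product

def t3(a, b):
    return min(a, b)               # minimum

def premise_evaluation(t_norm, clause_intervals):
    """Fold the T-norm over the lower bounds and over the upper bounds."""
    lowers = [lower for lower, _ in clause_intervals]
    uppers = [upper for _, upper in clause_intervals]
    return reduce(t_norm, lowers), reduce(t_norm, uppers)

clauses = [(0.7, 0.9), (0.5, 1.0)]      # hypothetical [b_i, B_i] pairs
print(premise_evaluation(t3, clauses))  # MIN keeps the weakest clause
print(premise_evaluation(t1, clauses))  # T1 yields the smallest bounds
```

Because a T-norm is associative, folding it pairwise over the clauses gives the same result as an m-ary definition.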

Conclusion Detachment: The conclusion detachment operation indicates the certainty with which the conclusion can be asserted, given the strength and appropriateness of the rule. Let s and n be the lower bounds of the degree of sufficiency and necessity, respectively, of the given rule, and let $[b, B]$ be the computed premise certainty range. Then the range $[c, C]$, indicating the lower and upper bound for the certainty of the conclusion inferred by such a rule, is defined as:

$$[c, C] = [\,T(s, b),\ N(T(n, N(B)))\,]$$

The degrees of sufficiency and necessity respectively indicate the amount of certainty with which the rule premise implies its conclusion and vice versa. The sufficiency degree is used with modus ponens to provide a lower bound of the conclusion. The necessity degree is used with modus tollens to obtain a lower bound for the complement of the conclusion (which can be transformed into an upper bound for the conclusion itself).
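The detachment formula $[c, C] = [T(s, b), N(T(n, N(B)))]$ can be sketched as follows. The negation $N(x) = 1 - x$ is an assumption consistent with the complemented interval used elsewhere in the paper; the function names and the numeric values are hypothetical.

```python
def negation(x):
    # Assumed standard negation N(x) = 1 - x.
    return 1.0 - x

def detach(t_norm, s, n, b, B):
    """Return [c, C] for a rule with sufficiency s and necessity n,
    given the premise certainty range [b, B]."""
    c = t_norm(s, b)                           # modus ponens: lower bound
    C = negation(t_norm(n, negation(B)))       # modus tollens: upper bound
    return c, C

# Premise well supported: the conclusion gets a useful lower bound.
print(detach(min, s=0.9, n=0.2, b=0.8, B=1.0))
# Premise essentially refuted: necessity pulls the upper bound down.
print(detach(min, s=0.9, n=0.2, b=0.0, B=0.1))
```

With a small necessity the upper bound stays close to 1 even when the premise fails, which matches the reading of n as the confidence with which the conclusion can be negated.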

Conclusion aggregation: The conclusion aggregation operation determines the consolidated degree to which the conclusion is believed if supported by more than one path in the rule deduction graph, i.e., by more than one rule instance. It is also possible to have various groups of deductive paths, i.e., various sets of rule instances, all supporting the same conclusion. Each group of deductive paths can have a distinct conclusion aggregation operator associated with it. Let the ranges $[c_i, C_i]$ indicate the certainty lower and upper bounds of the same conclusion inferred by various rule instances belonging to the same group. Then, for each group of deductive paths, the range $[d, D]$ of the aggregated conclusion is defined as:

$$[d, D] = [\,S(c_1, c_2, \ldots, c_m),\ S(C_1, C_2, \ldots, C_m)\,]$$

where S is the T-conorm dual to T, i.e., $S(x, y) = N(T(N(x), N(y)))$.
RUM distinguishes between rule instances generated from the same rule and rule instances derived from different rules. The first type of rule instance is aggregated first, to take into account the usually large amount of redundancy that such rule instances entail. The second set of rule instances is subsequently aggregated, taking into account the knowledge about the presence or lack of positive/negative correlation that characterizes the various rules.
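The two-stage aggregation can be sketched as below, with each T-conorm obtained from its dual T-norm through $S(x, y) = N(T(N(x), N(y)))$ and $N(x) = 1 - x$ (a standard duality, assumed here). The grouping structure, the function names, and the numbers are hypothetical; the choice of MAX within a rule and the probabilistic sum across rules is only one possible assignment.

```python
from functools import reduce

def dual_conorm(t_norm):
    """Build S(x, y) = 1 - T(1 - x, 1 - y) from a T-norm T."""
    return lambda x, y: 1.0 - t_norm(1.0 - x, 1.0 - y)

def aggregate(t_conorm, intervals):
    """Apply the T-conorm to the lower bounds and to the upper bounds."""
    lowers = [lower for lower, _ in intervals]
    uppers = [upper for _, upper in intervals]
    return reduce(t_conorm, lowers), reduce(t_conorm, uppers)

def two_stage_aggregate(groups, within_rule, across_rules):
    """First merge instances of the same rule, then merge across rules."""
    per_rule = [aggregate(within_rule, instances) for instances in groups]
    return aggregate(across_rules, per_rule)

s3 = dual_conorm(min)                      # MAX, suited to redundancy
s2 = dual_conorm(lambda a, b: a * b)       # probabilistic sum

# Hypothetical detachments: two instances of one rule, one of another.
groups = [[(0.6, 1.0), (0.5, 1.0)], [(0.3, 1.0)]]
print(two_stage_aggregate(groups, within_rule=s3, across_rules=s2))
```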

Source Consensus: The source consensus operation reflects the fusion of the certainty measures of the same evidence A provided by different sources. The evidence can be an observed fact or a deduced fact. In the former case, the fusion occurs before the evidence is used as an input in the deduction process. In the latter case, the fusion occurs after the evidence has been aggregated by each group of deductive paths. The source consensus operation reduces the ignorance about the certainty of A by producing an interval that is always smaller than or equal to the smallest interval provided by any of the information sources. If there is an inconsistency among some of the sources, the resulting certainty intervals will be disjoint, thus introducing a conflict in the aggregated result. Let $[L_1(A), U_1(A)], [L_2(A), U_2(A)], \ldots, [L_n(A), U_n(A)]$ be the certainty lower and upper bounds of the same conclusion provided by different sources of information. Then the result $[L_{tot}(A), U_{tot}(A)]$, obtained from fusing all the assertions about A, is given by taking the intersection of the certainty intervals:

$$[L_{tot}(A), U_{tot}(A)] = [\,\max_i L_i(A),\ \min_i U_i(A)\,]$$
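A minimal sketch of source consensus as interval intersection follows. The conflict test (lower bound exceeding upper bound) is one way of detecting disjoint source intervals; the function name and the sample intervals are hypothetical.

```python
def source_consensus(intervals):
    """Intersect the certainty intervals reported by several sources.

    Returns (lower, upper, conflict); conflict is True when the sources
    are inconsistent, i.e. the intersection is empty.
    """
    lower = max(low for low, _ in intervals)
    upper = min(up for _, up in intervals)
    return lower, upper, lower > upper

# Three roughly agreeing sources narrow the interval.
print(source_consensus([(0.2, 0.9), (0.4, 1.0), (0.3, 0.8)]))
# Two disjoint reports produce a conflict.
print(source_consensus([(0.0, 0.3), (0.6, 1.0)]))
```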

3.3 Control: Calculus Selection, Uncertain-Belief Revision,
Context Mechanism

3.3.1  Calculi  Selection 

As discussed in the previous section, RUM's Rule System uses a set of five T-norm based calculi. The calculus used by each rule instance is inherited from its rule subclass (the rule before the instantiation). The calculus can be modified through KEE's user interface or programmatically (i.e., by an active value). Class inheritance can also be used to modify the degree of sufficiency and necessity of all the rule members of the same class.

The calculi selection consists of two assignments. The first assignment indicates the T-norm with which the premise evaluation and the conclusion detachment will be computed. Such an assignment is made for each rule and, through inheritance, is passed to all rule instances derived from the same rule.

The second assignment indicates the T-conorm (represented by its dual T-norm) with which the conclusion aggregation will be computed. This assignment is made for each subset of rule instances generated from different rules and asserting the same conclusion.

3.3.1.1 Rationale for Calculi Selection

The T-norm characteristics will determine the selection choices. For the first assignment, the T-norm assigned to each rule for the premise evaluation and the conclusion detachment will be a function of the decision maker's attitude toward risk. The ordering of the T-norms, which is identical to the ordering of parameter p in the Schweizer & Sklar family of T-norms, reflects the ordering from a conservative attitude ($p = -1$, or $T_1$) to a non-conservative one ($p \to \infty$, or $T_3$). From the definition of the calculi operations, we can see that $T_1$ will generate the smallest premise evaluation and the weakest conclusion detachment (i.e., the widest uncertainty interval attached to the rule's conclusion). T-norms generated by larger values of p will exhibit less drastic behaviors and will produce nested intervals with their detachment operations. $T_3$ will generate the largest premise evaluation and the strongest conclusion detachment (the smallest certainty interval).
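The ordering by p can be checked numerically with a short sketch of the Schweizer & Sklar family, written here in the commonly cited form $T_p(a, b) = (\max(0,\ a^{-p} + b^{-p} - 1))^{-1/p}$, with the product as the limit $p \to 0$ and MIN as the limit $p \to \infty$. This parameterization and the intermediate values $p = -0.5$ and $p = 1$ are assumptions for illustration; the text itself only names $p = -1$ and $p \to \infty$.

```python
import math

def schweizer_sklar(p):
    """Return the T-norm of the (assumed) Schweizer & Sklar family for p."""
    if p == 0:
        return lambda a, b: a * b          # limit p -> 0: product
    if math.isinf(p) and p > 0:
        return min                         # limit p -> infinity: MIN
    def t_norm(a, b):
        if p > 0 and (a == 0.0 or b == 0.0):
            return 0.0                     # avoid 0 ** negative exponent
        base = a ** (-p) + b ** (-p) - 1.0
        return max(0.0, base) ** (-1.0 / p)
    return t_norm

# Hypothetical certainties combined under increasingly optimistic calculi.
a, b = 0.7, 0.8
for p in (-1, -0.5, 0, 1, math.inf):
    print(p, round(schweizer_sklar(p)(a, b), 4))
```

For the sample values the results increase with p, from the Lukasiewicz value at $p = -1$ up to MIN, which is the ordering from conservative to non-conservative described above.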

For the second assignment, the T-norm assigned to the subsets of rule instances (derived from different rules and asserting the same conclusion) will be a function of the lack or presence of positive/negative correlation among the rules in each subset. The ordering of the T-norms reflects the transition from the case of extreme negative correlation, i.e., mutual exclusiveness ($T_1$), through the case of uncorrelation ($T_2$), to the case of extreme positive correlation, i.e., subsumption ($T_3$).
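The effect of the correlation assumption on aggregation can be seen by applying the three T-conorms dual to $T_1$, $T_2$, and $T_3$ to the same pair of supports. The closed forms below (bounded sum, probabilistic sum, maximum) are the standard duals and are assumptions in the sense that the text does not spell them out; the input values are hypothetical.

```python
def s1(a, b):
    return min(1.0, a + b)      # dual of T1: mutually exclusive support adds up

def s2(a, b):
    return a + b - a * b        # dual of T2: uncorrelated support

def s3(a, b):
    return max(a, b)            # dual of T3: one rule subsumes the other

# Two different rules support the same conclusion with these lower bounds.
a, b = 0.6, 0.5
for name, conorm in (("S1", s1), ("S2", s2), ("S3", s3)):
    print(name, round(conorm(a, b), 4))
```

The more positively correlated the rules are assumed to be, the less the second rule adds to the support already provided by the first.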

Currently, all calculi assignments are explicitly made and modified through the user interface, to exercise the implemented accessing functions. In the next development phase of RUM's control layer, the calculi assignments will be made by a set of selection rules expressing the meta-knowledge about the context. These rules will select the T-norms that better reflect the knowledge engineer's desired attitude toward risk and the perceived amount of correlation among the rules used in such a context.

3.3.2  Uncertain-Belief  Revision 

A daemon-based implementation of the belief revision of the uncertain information is available in the control layer of RUM's Rule System. For any conclusion made by a rule, the belief revision mechanism monitors the changes in the certainty measures of the wffs that constitute the conclusion's support or the changes in the calculus used to compute the conclusion certainty measure. Validity flags are inexpensively propagated through the rule deduction graph. Five types of flag values are used:

Good  Guarantees the validity of the cached certainty measure detached by the rule instance and aggregated into the associated wff.

Bad (level i)  Indicates that the cached certainty measure detached by the rule instance is no longer reliable, since the support of some of the wffs in the premise of this rule instance has changed. The ith level indicates the correct order of recomputation.

Inconsistent  Indicates that the cached certainty measure associated with the wff is conflicting. The inconsistency can be removed by executing a locally defined procedure (differential diagnosis type of experiment, recency of information, split in possible worlds with subsets of the original sources, etc.).

Not Applicable  Indicates that the context of the rule instance is no longer active and the rule instance contribution to the aggregated certainty measure of the wff should be ignored.

Ignorant  Indicates that the cached certainty measure detached by the rule instance is too vague to be useful. The default behavior is to ignore the rule instance contribution to the aggregated certainty measure of the wff. A locally defined procedure could be used to remove the ignorance if so specified.
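The flag values and the level attached to the Bad flag can be sketched as follows. This is a simplified illustration under stated assumptions: flags are attached to wffs rather than to individual rule-instance detachments, the graph is acyclic, and the level of a node is taken to be its longest distance from the changed fact so that every input is recomputed before its consumer. The names `Flag`, `propagate_bad`, and `consumers` are hypothetical.

```python
from enum import Enum

class Flag(Enum):
    GOOD = "good"
    BAD = "bad"
    INCONSISTENT = "inconsistent"
    NOT_APPLICABLE = "not applicable"
    IGNORANT = "ignorant"

def propagate_bad(consumers, changed):
    """Return {wff: level} for every wff invalidated by a changed fact.

    consumers maps a wff to the wffs inferred from it (acyclic graph assumed).
    """
    levels = {}
    def visit(node, level):
        for successor in consumers.get(node, []):
            if levels.get(successor, 0) < level:
                levels[successor] = level      # keep the longest distance
                visit(successor, level + 1)
    visit(changed, 1)
    return levels

# Hypothetical fragment: D supports J, J supports K, L supports M.
consumers = {"D": ["J"], "J": ["K"], "L": ["M"]}
bad_levels = propagate_bad(consumers, "D")
flags = {wff: (Flag.BAD, level) for wff, level in bad_levels.items()}
print(flags)
```

Only the flags are pushed through the graph at this stage; no certainty measure is recomputed, which is what keeps the propagation inexpensive.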

3.3.2.1 An Example of Using the Uncertain-Belief Revision

To provide the reader with a better understanding of the uncertain-belief revision, we will make the following graphical analogy: the wffs of the reasoning system correspond to nodes in an acyclic deductive graph; the inference rules in the system correspond to the inference gates that connect the nodes in the graph. There are two types of wffs: the observations or assumptions, corresponding to the nodes at the frontier of the graph, and the inferred conclusions, corresponding to the intermediate nodes in the graph. The first type of node does not have any logical support (its evidence source is the observer or the assumption's maker). The second type of node has a logical support represented by the set of rule instances that made that inference. For this second type of node, this logical support is the evidence source.

Figure 4 illustrates a portion of an acyclic deductive graph. In the graph we can observe the following five rules:*

R1: $C \Rightarrow (A, B \leftrightarrow J)$  suffic. = $s_1$, necess. = $n_1$, calcul. = $T_j$, aggreg. = $S_2$

R2: $C \Rightarrow (D \leftrightarrow J)$  suffic. = $s_2$, necess. = $n_2$, calcul. = $T_j$, aggreg. = $S_2$

R3: $(E \leftrightarrow J)$  suffic. = $s_3$, necess. = $n_3$, calcul. = $T_j$, aggreg. = $S_j$

R4: $H \Rightarrow (E, F, G \leftrightarrow J)$  suffic. = $s_4$, necess. = $n_4$, calcul. = $T_j$, aggreg. = $S_j$

R5: $(J, I \leftrightarrow K)$  suffic. = $s_5$, necess. = $n_5$, calcul. = $T_j$, aggreg. = $S_2$

Two more rules, R6 and R7, are partially shown in the same figure.


* The following notation is used in the rule description and in the figure:

","  indicates intersection (input of the same gate)

"$\Rightarrow$"  indicates (strong) material implication (control line on the side of the gate)

"$\leftrightarrow$"  indicates (weak) logical equivalence, i.e., if-and-only-if rule (gate)

"$s_i$"  indicates the lower bound of sufficiency of rule i

"$n_i$"  indicates the lower bound of necessity of rule i

"[$T_j$]"  indicates the calculus (T-norm) used by the rule to perform premise aggregation and conclusion detachment. Suffix j takes one of the following values: {1, 1.5, 2, 2.5, 3}

"$S_j$"  indicates the calculus (T-conorm) used to perform the conclusion aggregation. Suffix j takes one of the following values: {1, 1.5, 2, 2.5, 3}

"$L_i$, $U_i$"  indicate the lower and upper bounds of the conclusion detached from rule i


Figure  4.  Portion  of  an  Acyclic  Deductive  Graph 


In Figure 4, C and H represent two context descriptions that enable/disable the activation of rules R1, R2, and R4. The other two rules (R3 and R5) are always potentially active (regardless of context). The figure shows the case in which fact D has just changed. This change causes the propagation of a bad-validity flag that affects the conclusions of rules R2 and R5 (J and K, respectively). The numbers attached to the bad flag indicate the order in which a recomputation of the certainty measures must be performed. Fact H has also changed, and its new value no longer satisfies the context description of rule R4, thus causing the not-applicable flag to be attached to the detachment of R4. Fact L has also changed, affecting the validity of rule R6's detachment.

3.3.2.2 Reasoning under Pressure

The belief revision system offers both backward and forward processing. A lazy evaluation, running in depth-first, backward mode, recomputes the certainty measures of the modified wffs that are required to answer a given query. This mode (called reasoning under pressure) is used when the system or the user decides that they are dealing with time-critical tasks. In the case illustrated in the previous figure, if the value of wff K were requested, the system would perform the following sequence of tasks: fetch the new certainty values of D (lower and upper bounds); recompute the detachment of rule R2; use T-conorm $S_2$ to evaluate the OR node (with R1 and R2's detachments); ignore R4's detachment, treating R3's detachment as the only input to the OR node associated with T-conorm $S_j$; fuse the two OR nodes, defining the new certainty values of wff J; recompute the detachment of rule R5; use T-conorm $S_2$ to evaluate the OR node (with R5 and R7's detachments), obtaining the new certainty values of wff K.

When time is not critical, the system can use a breadth-first, forward mode processing to recompute the certainty measures of the modified wffs, attempting to restore the integrity of the rule deduction graph. In the case illustrated in the previous figure, this implies an update of fact L and rule R6 (both of which were not considered by the backward mode, since they did not play any role in determining the value of the proposed query, e.g., wff K).
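The difference between the two modes can be sketched with a toy graph. This is a simplification under stated assumptions: each wff carries a single number instead of a certainty interval, the combining function stands in for detachment and aggregation, and the forward pass simply walks a list that is assumed to be already in dependency order. All names (`Node`, `query`, `forward_refresh`, node M) are hypothetical.

```python
class Node:
    """A wff in the deduction graph; facts have no combine function."""
    def __init__(self, name, inputs=(), combine=None, value=None):
        self.name = name
        self.inputs = list(inputs)
        self.combine = combine
        self.value = value
        self.stale = combine is not None   # inferred nodes start stale

def query(node, log):
    """Backward, depth-first: recompute only what this node depends on."""
    if node.combine is not None and node.stale:
        node.value = node.combine([query(i, log) for i in node.inputs])
        node.stale = False
        log.append(node.name)
    return node.value

def forward_refresh(nodes_in_dependency_order, log):
    """Forward pass: restore every stale node, inputs before consumers."""
    for node in nodes_in_dependency_order:
        query(node, log)

# Hypothetical fragment: D -> J -> K, and L -> M (M stands for R6's conclusion).
D, L = Node("D", value=0.4), Node("L", value=0.7)
J = Node("J", [D], min)
K = Node("K", [J], min)
M = Node("M", [L], min)

log = []
query(K, log)            # under pressure: only J and K are recomputed
print(log, M.stale)
forward_refresh([D, L, J, K, M], log)
print(log, M.stale)      # the forward pass also restores M
```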


The structure of the graph can also change, as new rule instances are created or deleted due to changes in the facts' values (as opposed to the facts' certainty values). The deduction graph is updated and bad flags are propagated throughout the network.

3.3.3  Rule  Firing  Control  via  Context  Activation 

A user-definable threshold can be attached to each rule context, either by local definition or by inheritance from a rule class. A rule context is defined as a conjunction of conditions that must be satisfied before the rule can be considered for premise evaluation. Each condition is described by a predicate on object-level wffs (facts in the problem domain) or control-level wffs (markers asserted by meta-rules). The semantics of a context C attached to an inference rule (establishing the weak logical equivalence between A and B) is given by the following expression:

$$C \Rightarrow (A \leftrightarrow_{s,n} B)$$

where s and n indicate the lower bounds of the degree of sufficiency and necessity that the rule provides; $\Rightarrow$ represents the strong material implication; $\leftrightarrow$ denotes the weak logical equivalence.

The context mechanism provides the following features:

1. By activating/deactivating subsets of the KB, it limits the number of rules that will be considered relevant at any given time, thus increasing the overall system efficiency.

2. By only considering the rules relevant to a given situation, it allows the knowledge engineer to effectively use the necessary conditions in the rule's premise. It is now possible to distinguish between the failure of a necessary test (described in the premise) and the failure of the rule's applicability (traditionally described by other clauses in the same premise and now explicitly represented in the context).

3. By using predicates on the control-level wffs, it provides the required programmability for defining flexible control strategies, such as causing sequences of rules to be executed, firing default rules, ordering and handling time-dependent information, etc.

4. By using hierarchical contexts, it can be used as an organizing principle for the knowledge acquisition task.
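The distinction drawn in feature 2 can be sketched in a few lines. The sketch assumes that a context is active when the T-norm of the lower bounds of its conditions reaches the user-defined threshold; the text does not specify how the threshold is compared, so this rule, the function names, and the numbers are all hypothetical.

```python
def context_certainty(condition_lowers, t_norm=min):
    """Conjunction of the context conditions (lower bounds only)."""
    certainty = 1.0
    for lower in condition_lowers:
        certainty = t_norm(certainty, lower)
    return certainty

def try_rule(context_lowers, threshold, premise_lower, sufficiency, t_norm=min):
    """Gate the rule on its context before evaluating the premise."""
    if context_certainty(context_lowers, t_norm) < threshold:
        return "not applicable", None          # the rule is out of context
    return "evaluated", t_norm(sufficiency, premise_lower)

# Context holds but the necessary test in the premise fails.
print(try_rule([0.9, 0.8], threshold=0.5, premise_lower=0.0, sufficiency=0.9))
# Context does not hold: the premise is never considered.
print(try_rule([0.9, 0.2], threshold=0.5, premise_lower=1.0, sufficiency=0.9))
```

The first case yields an evaluated rule with no support for its conclusion, while the second is reported as not applicable, so the two kinds of failure stay distinguishable.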

4.  Remarks  and  Conclusions 

RUM's layered architecture properly addresses the requirements defined by the desiderata [Bonissone 1986] for uncertain-reasoning systems. The representation layer captures the uncertain information about the wffs (lower and upper bounds) used by the calculi in the inference layer to determine the uncertainty of the conclusions. The representation layer also captures the uncertain meta-information (evidence source or logical support, measures of ignorance and conflict) used by the belief revision system and other mechanisms in the control layer.

The inference layer provides the knowledge engineer with a rich selection of well-understood calculi to properly represent existing correlations among rules. Numerical computations performed in this layer are efficiently implemented by using a four-parameter representation for the uncertainty bounds, supported by a set of closed-form formulae that implement the truth functional uncertainty calculi [Bonissone 1985].

The control layer provides the explicit selection and modification of uncertainty calculi. Its context activation mechanism allows the reasoning system to focus on the relevant subsets of the changing inference base (the acyclic deductive graph). The uncertain-belief revision maintains the integrity of those relevant subsets, reflecting the changes of the information.

RUM's development environment provides the traceability of wffs and rules that is required for proper KB development and refinement. An example of such a KB development is presented in the next paper, which describes the experiments used to validate RUM as a reasoning tool applied to a naval situation assessment task.
ABSTRACT

This paper summarizes our research efforts in the area of Reasoning with Incomplete and Uncertain Information, and is organized into three parts covering reasoning with uncertainty, reasoning by analogy, and reasoning with incompleteness.

Part I, entitled Reasoning with Uncertainty, summarizes the papers RUM: a Layered Architecture for Reasoning with Uncertainty [Bonissone, Gans, and Decker, 1987] and Using T-norm Based Uncertainty Calculi in a Naval Situation Assessment Application [Bonissone 1987]. The first paper describes an integrated software tool that implements the three-layer architecture concept described in our previous progress report. This software tool is based on KEE™ (Knowledge Engineering Environment), an expert system shell implemented in an object-oriented language.** The second paper illustrates an application of RUM in solving a multi-sensor, multi-target problem developed in LOTTA, an object-based simulation environment.

Part II, entitled Reasoning by Analogy, summarizes the paper A Mathematical Theory for Diagnosis Based on the MONAD Concept [Porter87]. This paper describes the model-based knowledge representation and search strategy used to form problem models in the MONAD system.

Part III, entitled Reasoning with Incomplete Information, summarizes three papers: An Algebraic Foundation for Truth Maintenance [Brown, Benanav, and Gaucas, 1987], Logics of Justified Beliefs [Brown, 1987], and A Role for Assumption-based and Non-monotonic Justifications in Automating Strategic Threat Analysis [Gaucas and Brown, 1987]. The first paper presents a reason maintenance system in which assumption-based justifications (ATMS) and non-monotonic justifications can be directly and transparently described. The second paper provides formal semantics to truth maintenance by offering a mathematical logic, equipped with an underlying model theory, that is used to characterize well-known models of truth maintenance. The third paper describes an experiment in using an assumption-based and non-monotonic reasoning capability in support of strategic analysis.

* This work was partially supported by the Defense Advanced Research Projects Agency (DARPA) under USAF/Rome Air Development Center contract F30602-85-C-0033. Views and conclusions contained in this paper are those of the authors and should not be interpreted as representing the official opinion or policy of DARPA or the U.S. Government.

** The implementation of RUM, Reasoning with Uncertainty Module, was not part of (nor was it funded by) DARPA contract F30602-85-C-0033. This description is included only for the purpose of illustrating the technology integration and transition efforts that GE, within the spirit of the Strategic Computing Initiative, has undertaken outside of its contractual obligations.
Abstract

We have recast the problem of truth maintenance in a setting of algebraic equations over Boolean lattices. If a method of labeling propositions to justify them according to some reasoning agent's constraints of belief happens to conform to the postulates of Boolean lattices, the labeling system can be reformulated as an algebraic equation solving system. All truth maintenance systems known to us can be so reformulated. This note summarizes our investigations into the existence and structure of solutions of these algebraic systems. Our central result is a unique factorization theorem for lattice equational systems and their solutions. Our theoretical results are interpreted to compare various styles of truth maintenance and to reveal certain computational difficulties implicit in the algebraic structure of truth maintenance.

I.  Introduction 

Lattice-theoretic truth maintenance is a single theoretical framework that subsumes various notions of truth maintenance, including the assumption-based justifications reported by de Kleer [de Kleer, 1984, de Kleer, 1986a, de Kleer, 1986b, de Kleer, 1986c] and the nonmonotonic justifications reported by Doyle [Doyle, 1979a, Doyle, 1979b, Doyle, 1978] and Goodwin [Goodwin, 1982, Goodwin, 1985, Goodwin, 1984, Goodwin, 1987]. Our complete body of work on lattice-theoretic truth maintenance includes:

• An analysis of the algebraic structure of truth maintenance

• An investigation of the abstract and concrete computational complexity of truth maintenance

• A formal account of the embedding of other forms of truth maintenance in the lattice-theoretic paradigm

In this note we focus on the first aspect, because of its intrinsic interest and because this aspect is a precursor to the others. Our express aim here is to present the lattice-theoretic account of truth maintenance, cite the more important algebraic results vis-à-vis this account, and interpret these results so as to cast a qualitative light on various computational considerations of truth maintenance. Readers interested in other aspects of our theoretical work or our practical experience with an implementation embodying this theory are referred to [Benanav et al., 1986].


The initial motivation for this work was the desire
to unify in a single abstraction the truth maintenance
paradigm of Doyle and Goodwin, and that of de Kleer.
The systems of these investigators can be viewed as constraint
propagation mechanisms. Given a disjunctive set
of sets of premises and a set of (monotonic) deductive constraints,
de Kleer's ATMS tells a client problem solving
system what things it is currently obliged to believe, assuming
one or another of the sets of premises. Doyle's and
Goodwin's TMS's, on the other hand, tell the client problem
solving system what things it is currently obliged to
believe, given a single set of premises under deductive constraints,
some of which may be nonmonotonic in nature.*
Our original intuition was that it should be possible to
account simultaneously for multiple sets of premises and
nonmonotonic deductive constraints.**,***

*A monotonic deductive constraint obliges a rational agent to believe
its consequent, given that it currently believes all of its antecedents.
A nonmonotonic deductive constraint obliges a rational
agent to believe its consequent given that it believes all of its monotonic
antecedents and none of its nonmonotonic antecedents.

**The intellectual challenge of unifying these two approaches to
truth maintenance is sufficient motivation for proceeding. Nonetheless,
we note that de Kleer [de Kleer, 1986b], and Morris and Nado
[Morris and Nado, 1986] are practically motivated to augment their
assumption-based truth maintenance systems to support some form
of nonmonotonic justification. In our approach nonmonotonicity will
be "built-in" rather than "added-on". Although we will not do so
here, it can be shown that our conceptually parsimonious approach
is at a computational advantage relative to the attempts of de Kleer,
and Morris and Nado.

***We have recently been made aware of the work of McDermott
[McDermott, 1983] whose perspective on truth maintenance has
much in common with our own. Indeed, his concrete solution to what
we will eventually define as even equational systems appears to be
identical to ours, though arrived at from a quite different point of departure.
Our investigation is broader in both the scope of equational
systems investigated, and in the characterization of those systems'
structures and solution spaces.

This intuition arose from the striking similarity observed
between the computations of truth maintenance
systems and the computations of global flow analysis
that underlie modern optimizing compilers [Aho and Ullman,
1977, Hecht, 1977, Schaeffer, 1973, Waite and Goos,
1984]. Global flow analysis can be couched in the following
terms: Given the constraints imposed by individual program
statements and their interconnecting topology, what
facts is a reasoning agent (in this case concerned with programs)
obliged to believe about the state of computation
at various points in the program's control flow? In a sense
the information propagation problem solved by global flow
analysis can be viewed as the dual of the truth maintenance
problem. The former assigns propositions to contexts established
by various paths through a program. The latter
assigns contexts of belief to propositions under various deductive
constraints. There are two principal methods of
solving information propagation problems. Both hinge on
solving systems of equations whose unknowns range over
the domain of an algebraic lattice. The work that we
will describe presently retains the idea of equations over a
lattice, but for various technical reasons (principally nonmonotonic
constraints) the solution methods used in global
flow analysis are inappropriate. A rather different solution
method has been developed.

II.  Lattice  Equational  Systems 

Let $\mathcal{B}$ be a Boolean lattice equipped with the usual
meet, join, and complementation operators; a partial order,
$\le$; and maximum and minimum elements, $\top$ and $\bot$,
respectively.* A complete account of such structures can
be found in any of [Balbes and Dwinger, 1974, Birkhoff,
1967, Skornjakov, 1977]. Elements of $\mathcal{B}$ will be called situations,
and will be denoted by $A$ and $B$. $A$ and $B$ (possibly
subscripted) are lattice expressions in $\mathcal{B}$. Moreover, if $A$
and $B$ are expressions in $\mathcal{B}$ then so are $A \lor B$, $A \land B$, $\overline{A}$
and $\overline{B}$. Especially important to us will be the existence
of the partial order, the complement, maximum and minimum
elements, and the mutual distributivity of meet and
join.

A lattice unknown is a super- and/or subscripted $s$.
Each lattice expression in $\mathcal{B}$ and unknown is a lattice form
in $\mathcal{B}$. Moreover, if $X$ and $Y$ are forms in $\mathcal{B}$ then so are
$X \lor Y$, $X \land Y$, $\overline{X}$ and $\overline{Y}$. Individual (fixed) lattice forms
in $\mathcal{B}$ will be denoted by $X$ and $Y$, possibly subscripted.
Every fact or proposition has an associated unknown. Note
that a proposition and its negation have distinct associated
unknowns. Indeed, an unknown corresponds exactly to a
node as that term is used by Doyle, Goodwin, and de Kleer.
A lattice equation over $\mathcal{B}$ is a relation of the form $s = X$
where $s$ is a lattice unknown and $X$ is a lattice form.
A lattice equational system over $\mathcal{B}$, $\mathcal{E}$, is any collection
of lattice equations over $\mathcal{B}$ such that the total number of
lattice unknowns occurring on the right-hand sides of the
equations is finite and any lattice unknown occurs at most
once on the left-hand side of an equation. The equation on
whose left-hand side $s$ appears will be called the $s$ equation.

*In this report we assume $\mathcal{B}$ to be a recursive set, its operators to
be total recursive functions, and its partial order to be a recursive
relation.

$\mathcal{E}$ will be sub- or superscripted when it is useful to
distinguish among various equational systems. Unless the
context is ambiguous, we will freely say 'system' without
modifiers. A lattice equational system should be interpreted
as encoding the way a reasoning agent's belief (or
disbelief) in a collection of propositions entails belief in
others. If $\mathcal{E}$ is a lattice equational system such that the
right-hand side of each equality is of the form $\bigvee_i \bigwedge_j X_{ij}$
where each $X_{ij}$ is an element of $\mathcal{B}$ or an unknown (possibly
complemented), then $\mathcal{E}$ is said to be in disjunctive
normal form.* Since we can transform any form into disjunctive
normal form, we will usually treat forms over $\mathcal{B}$
and lattice equational systems as if they were in disjunctive
normal form.

A solution to a lattice equational system, $\mathcal{E}$, is a function,
$\Gamma$, from the lattice unknowns into $\mathcal{B}$ such that if, for
each equation in the system, each unknown $s$ in the equation
is replaced by $\Gamma(s)$, the equation holds in $\mathcal{B}$. Moreover,
$\Gamma$ takes any unknown, $s$, not on the left-hand side of some
equation in $\mathcal{E}$ into $\bot$, and in that regard the system $\mathcal{E}$ implicitly
has the equation $s = \bot$. We will interpret lattice
equations as constraints. A solution, then, is a labeling of
propositions with situations. In particular, the situations
are those in which a reasoning agent is obliged to believe
the correspondingly labeled proposition given acceptance
of the constraints imposed by the system. We will often
subscript $\Gamma$ with the name of the system of which it is a
solution. A justification of a disjunctive normal form lattice
equational system, $\mathcal{E}$, is an ordered pair, $d = (s, X)$,
where $s$ appears on the left-hand side of some equation
in $\mathcal{E}$ and $X$ is a disjunct on the right-hand side of that
same equation. Also, $s$ is called the consequent of the justification
$d$ and each conjunct of the disjunct $X$ is called
a nonmonotonic or monotonic antecedent of $d$ depending
on whether or not it is complemented. The sets of monotonic
and nonmonotonic antecedents of $d$ are respectively
denoted $\alpha(d)$ and $\beta(d)$. A justification, $d$, is valid with respect
to a situation, $A$, and a solution, $\Gamma$, of an equational
system $\mathcal{E}$ if and only if

$$A \le \bigwedge_{s \in \alpha(d)} \Gamma(s) \;\land \bigwedge_{s \in \beta(d)} \overline{\Gamma(s)}$$

We will write Valid($A$, $d$, $\Gamma$) to indicate that $d$ is valid
with respect to $A$ and solution $\Gamma$. A solution, $\Gamma$, is well-founded
with respect to a lattice equational system, $\mathcal{E}$, at
lattice unknown, $s$, if and only if either $\Gamma(s) = \bot$, or $\Gamma(s) =
\bigvee_i A_i$, and for each $A_i$ there is a partially ordered set,
$(\mathcal{D}_{A_i}, \prec_{A_i})$, such that $\mathcal{D}_{A_i}$ is a set of justifications from $\mathcal{E}$
and

1. There is a justification, $d$ in $\mathcal{D}_{A_i}$, whose consequent is $s$

2. For every justification $d$ in $\mathcal{D}_{A_i}$, Valid($A_i$, $d$, $\Gamma$)

3. Every unknown, $s'$, that is a monotonic antecedent of
some $d$ in $\mathcal{D}_{A_i}$ is also the consequent of some justification
$d'$ in $\mathcal{D}_{A_i}$ and $d' \prec_{A_i} d$


*We use disjunctive normal form for notational convenience.
While its existence is required in establishing some of the formal results
that we cite, it plays no essential role in lattice-theoretic truth
maintenance computations.
A solution to a lattice equational system is well-founded if
and only if it is well-founded with respect to the system at
every lattice unknown mentioned in the system.

We interpret justifications, validity and well-foundedness
in the following way: Validity describes the circumstances
under which the consequents of a justification are
to be believed given the belief status of the antecedents. A
justification therefore constitutes an independent source of
support justifying belief in a consequent. Chaining justifications
together constitutes a supporting argument. Since
we wish our arguments to be noncircular, we impose an
additional condition, well-foundedness, to guarantee that
state of affairs.
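The validity condition can be checked mechanically once a concrete lattice is fixed. The following is a minimal sketch, not the authors' implementation: it assumes a finite Boolean lattice modelled as bitmasks over `N_ATOMS` atoms, and the names `Justification`, `valid`, `leq`, and `complement` are hypothetical. Antecedents are restricted to unknowns for brevity, and an unknown with no entry in the solution is taken to be $\bot$, as the text prescribes for unknowns that have no equation.

```python
from dataclasses import dataclass

# Assumption of this sketch: a finite Boolean lattice represented as bitmasks.
# Meet is &, join is |, bottom is 0, and top is TOP.
N_ATOMS = 3
TOP = (1 << N_ATOMS) - 1


def complement(x):
    """Lattice complement relative to TOP."""
    return TOP & ~x


def leq(a, b):
    """Lattice order: a <= b exactly when the meet of a and b is a."""
    return a & b == a


@dataclass(frozen=True)
class Justification:
    consequent: str
    monotonic: tuple       # uncomplemented antecedent unknowns
    nonmonotonic: tuple    # complemented antecedent unknowns


def valid(situation, d, gamma):
    """True when `situation` lies below the meet of the labels of the monotonic
    antecedents and the complements of the labels of the nonmonotonic ones."""
    bound = TOP
    for s in d.monotonic:
        bound &= gamma.get(s, 0)              # unknowns without a label are bottom
    for s in d.nonmonotonic:
        bound &= complement(gamma.get(s, 0))
    return leq(situation, bound)


# Hypothetical labeling: s_a holds in atoms {0, 1}, s_d holds in atom {0}.
gamma = {"s_a": 0b011, "s_d": 0b001}
d = Justification("s_p", monotonic=("s_a",), nonmonotonic=("s_d",))
```

With this labeling, `valid(0b010, d, gamma)` holds because atom 1 lies inside `s_a` and outside `s_d`, while `valid(0b001, d, gamma)` fails because atom 0 lies inside `s_d`.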

Let us first consider some uninterpreted equational
systems, all taken to be over the Boolean lattice, $\mathcal{B}$, having
at least two distinct elements. The system $\mathcal{E}_1$

$$s = s$$

has one well-founded solution, $\{s = \bot\}$. On the other
hand, any of $\{\{s = A\} \mid A \ne \bot \text{ and } A \in \mathcal{B}\}$ are also solutions,
though not well-founded. The system $\mathcal{E}_2$

$$s = \overline{s}$$

the classical "odd loop" of Doyle's TMS, has no solutions,
well-founded or otherwise. The system $\mathcal{E}_3$

$$s_1 = s_2$$

$$s_2 = s_1$$

has one well-founded solution, $\{s_1 = \bot, s_2 = \bot\}$. The
system $\mathcal{E}_4$

$$s_1 = \overline{s_2}$$

$$s_2 = \overline{s_1}$$

has well-founded solutions $\{\{s_1 = A, s_2 = \overline{A}\} \mid A \in \mathcal{B}\}$.
The system $\mathcal{E}_5$

$$s_1 = \overline{s_1} \lor s_2$$

$$s_2 = \overline{s_2} \lor s_1$$

has a single solution $\{s_1 = \top, s_2 = \top\}$, and it is not well-founded.
Finally, $\mathcal{E}_6$

$$s_1 = \overline{s_2} \land \overline{s_3}$$

$$s_2 = \overline{s_1} \land \overline{s_3}$$

$$s_3 = \overline{s_1} \land \overline{s_2}$$

has well-founded solutions $\{\{s_1 = \bot, s_2 = A, s_3 = \overline{A}\} \mid A \in
\mathcal{B}\} \cup \{\{s_1 = A, s_2 = \bot, s_3 = \overline{A}\} \mid A \in \mathcal{B}\} \cup \{\{s_1 = A, s_2 =
\overline{A}, s_3 = \bot\} \mid A \in \mathcal{B}\}$. If $\mathcal{B} = \{\top, \bot\}$ and interpreting $\top$ as
"IN" and $\bot$ as "OUT", it should be apparent to readers
familiar with the TMS's of Doyle and Goodwin how lattice
equational systems correspond to their TMS nodes and
justifications.
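Over the two-element lattice the solution sets of these small systems can be confirmed by exhaustive enumeration. The sketch below is only an illustration under that assumption ($\top$ is `True`, $\bot$ is `False`); the helper `solutions` is hypothetical, and it tests solution-ness only, not well-foundedness.

```python
from itertools import product


def solutions(unknowns, equations):
    """Enumerate every assignment over {False, True} that satisfies all equations.

    `equations` maps each unknown to a function computing its right-hand side
    from the full assignment."""
    found = []
    for values in product([False, True], repeat=len(unknowns)):
        g = dict(zip(unknowns, values))
        if all(g[s] == rhs(g) for s, rhs in equations.items()):
            found.append(g)
    return found


E1 = {"s": lambda g: g["s"]}
E2 = {"s": lambda g: not g["s"]}                       # the "odd loop"
E4 = {"s1": lambda g: not g["s2"],
      "s2": lambda g: not g["s1"]}
E5 = {"s1": lambda g: (not g["s1"]) or g["s2"],
      "s2": lambda g: (not g["s2"]) or g["s1"]}
E6 = {"s1": lambda g: (not g["s2"]) and (not g["s3"]),
      "s2": lambda g: (not g["s1"]) and (not g["s3"]),
      "s3": lambda g: (not g["s1"]) and (not g["s2"])}

for name, unknowns, system in [("E1", ["s"], E1), ("E2", ["s"], E2),
                               ("E4", ["s1", "s2"], E4),
                               ("E5", ["s1", "s2"], E5),
                               ("E6", ["s1", "s2", "s3"], E6)]:
    print(name, solutions(unknowns, system))
```

The enumeration finds no assignment for the odd loop, a single assignment for the fifth system, and three assignments for the sixth, each with exactly one unknown labeled $\top$.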


The correspondence with de Kleer's ATMS is a little
harder to convey, and we shall attempt only an approximation
here.* We shall do this by actually interpreting
a lattice equational system with respect to a toy application.
Imagine a simple series-connected circuit consisting
of a voltage source, $V$, of 5 volts connected to resistor $R_1$ at
node $n_1$, which in turn is connected to resistor $R_2$ at node
$n_2$, which is connected to ground. The application is a program
that diagnoses ground faults in electrical circuits. In
its truth maintenance database it has the following system
of equations, $\mathcal{E}_7$:

$$s_1 = A$$

$$s_2 = B_1$$

$$s_3 = B_2$$

$$s_4 = s_1 \land s_2 \land s_3$$

The situations $A$, $B_1$, and $B_2$ respectively correspond to
the assumption that the voltage source, $V$, and resistors,
$R_1$ and $R_2$, are working. $s_1$ corresponds to the proposition
that voltage at node $n_1$ is held at 5 volts. $s_2$ corresponds
to the conjunctive proposition that the current into the
resistor $R_1$ at node $n_1$ is the same as the current out of the
resistor $R_1$ at node $n_2$ and that the voltage drop across
the resistor is the product of its resistance and the current
through. $s_3$ corresponds to the conjunctive proposition
that the current into the resistor $R_2$ at node $n_2$ is
the same as the current out of the resistor at ground and
that the voltage drop across the resistor is the product of
its resistance and the current through. Finally, $s_4$ corresponds
to the proposition that the voltage at node $n_2$ is the
product of 5 volts and the resistance of $R_2$ divided by the
sum of the resistances of $R_1$ and $R_2$. The equations can
now be interpreted as saying that the propositions associated
with $s_1$, $s_2$, and $s_3$ hold whenever the corresponding
assumptions can be believed. The proposition associated
with $s_4$ is believed whenever the propositions associated
with $s_1$, $s_2$, and $s_3$ are believed. A solution to this system
will tell us the circumstances under which the various
propositions are to be believed. Since the well-founded solution
is $\{s_1 = A, s_2 = B_1, s_3 = B_2, s_4 = A \land B_1 \land B_2\}$, a
reasoning agent believes the propositions associated with
$s_1$, $s_2$, $s_3$, and $s_4$ in situations whose meets with (respectively)
$A$, $B_1$, $B_2$, and $A \land B_1 \land B_2$ are not $\bot$.
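One concrete way to picture this example is to model the lattice as the powerset of "worlds", each world recording which components work. This modelling choice is an assumption of the sketch, not something the text prescribes, and `WORLDS`, `gamma`, and `believed` are hypothetical names. Meet is then set intersection and $\bot$ is the empty set.

```python
from itertools import product

# Assumption: a world is a triple (V works, R1 works, R2 works).
WORLDS = frozenset(product([False, True], repeat=3))
BOTTOM = frozenset()

# Situations A, B1, B2: the sets of worlds in which each component works.
A = frozenset(w for w in WORLDS if w[0])
B1 = frozenset(w for w in WORLDS if w[1])
B2 = frozenset(w for w in WORLDS if w[2])

# The well-founded solution of the circuit system, computed equation by equation.
gamma = {"s1": A, "s2": B1, "s3": B2}
gamma["s4"] = gamma["s1"] & gamma["s2"] & gamma["s3"]


def believed(proposition, situation):
    """A proposition is believed in a situation whose meet with its label is not bottom."""
    return (gamma[proposition] & situation) != BOTTOM


source_broken = WORLDS - A   # the complement of A
print(believed("s4", B1))             # R1 working is compatible with s4
print(believed("s4", source_broken))  # a broken source rules out s4
```

In this model the label of `s4` is the single world in which all three components work, so any situation that excludes that world gives no grounds for believing the voltage-divider proposition.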

*Readers interested in the precise details of encoding these other
truth maintenance systems in the lattice-theoretic paradigm should
consult [Benanav et al., 1986].

In the foregoing examples we have made implicit use
of the fact that any set of TMS or ATMS justifications
has equivalent renderings in the lattice-based formalization.
For our last example we consider the classical problem
of adding facts to or deleting facts from worlds or
states. To begin with, we interpret situations as worlds or
states. We have already asserted that every fact or proposition,
$p$, has an associated unknown, say $s_p$. We will also
posit additional unknowns, $s_a$ and $s_d$, corresponding to the
beliefs (respectively) that $p$ has been added and that $p$ has
been deleted. Consider now the system of equations

$$s_a = A$$

$$s_d = B$$

$$s_p = s_a \land \overline{s_d}.$$

The well-founded solution of this system is $\{s_a = A, s_d =
B, s_p = A \land \overline{B}\}$. Our interpretation of this solution is that
a reasoning agent believes $p$ just in case he believes himself
to be in a world or state whose meet with $A \land \overline{B}$ is not $\bot$.
In addition, we can use the lattice partial order to encode
inheritance among worlds. Notice that the fact $p$ will be
added to any world, $A'$, such that $A' \le A \land \overline{B}$, and deleted
from any world, $B'$, such that $B' \le B$.

III.  The  Existence  of  Solutions 

We have seen how lattice-theoretic truth maintenance is
connected to some well known models of truth maintenance;
we now turn to the challenge of actually solving
truth maintenance problems in this new paradigm. We
have already seen in $\mathcal{E}_2$ that solutions need not exist,
but even if they do (as in $\mathcal{E}_5$), there may not be well-founded
ones. It is well known that general polynomial
equations in rational coefficients cannot be solved by applying
the operations of addition, multiplication, and rational
root extraction to their coefficients [Birkhoff and
MacLane, 1965, van der Waerden, 1953]. By analogy we
might ask about the solvability of lattice-equational systems
by taking meets, joins, and complements of lattice
expressions appearing in the equations. Put another way,
could it be the case that the equations are not solvable by
applying the obvious operations to the available data? To
answer this question we must first formalize our notion of
the 'available data'.

A surface element of a lattice equational system $\mathcal{E}$ is
an element of $\mathcal{B}$ that actually appears in $\mathcal{E}$. The Boolean
lattice generated by meets, joins, and complements over
the surface elements is called the surface lattice. An atom
of the Boolean lattice, $\mathcal{B}$, is any element $A \in \mathcal{B}$, $A \ne
\bot$, such that there is no $B \in \mathcal{B}$ satisfying $A > B > \bot$.
A lattice is atomic if each of its elements is the join of
atoms. If $\mathcal{E}$ contains only a finite number of equations, the
number of surface elements is finite and thus the surface
lattice is atomic. The atoms of this lattice will be called
surface atoms. A surface solution is one such that for every
lattice unknown, $s$, that appears in $\mathcal{E}$, $\Gamma(s)$ is in the surface
lattice. Consider again the system, $\mathcal{E}_4$. Note that it has
many possible well-founded solutions (depending on the
Boolean lattice with respect to which the system is being
interpreted) of which only two, $\{s_1 = \top, s_2 = \bot\}$ and
$\{s_1 = \bot, s_2 = \top\}$, are surface. Our question posed in the
last paragraph is answered by the following:

Theorem III.1 If a finite lattice equational system $\mathcal{E}$
over $\mathcal{B}$ has a well-founded solution, then it has a well-founded
surface solution.


Thus  we  see  that  if  there  are  any  well-founded  solutions  at 
all,  we  are  guaranteed  that  some  of  them  can  be  computed 
by  taking  meets,  joins,  and  complements  over  the  available 
data. 
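For a finite lattice the surface atoms and the surface lattice can be computed directly from the surface elements. The sketch below assumes, purely for illustration, that lattice elements are bitmasks below a given `top`; the functions `surface_atoms` and `surface_lattice` are hypothetical names. Each surface element splits every current atom into the part inside it and the part outside it, and the surface lattice is the set of all joins of the resulting atoms.

```python
def surface_atoms(elements, top):
    """Atoms of the Boolean lattice generated by `elements` (bitmasks below `top`)."""
    atoms = [top]
    for e in elements:
        refined = []
        for a in atoms:
            # Split atom a into its meet with e and its meet with the complement of e.
            for piece in (a & e, a & ~e & top):
                if piece:                      # discard bottom
                    refined.append(piece)
        atoms = refined
    return sorted(atoms)


def surface_lattice(elements, top):
    """All joins of surface atoms, including bottom (the empty join)."""
    lattice = {0}
    for a in surface_atoms(elements, top):
        lattice |= {x | a for x in lattice}
    return lattice


TOP = 0b1111            # four underlying atoms, an arbitrary choice for the demo
A, B = 0b0011, 0b0001   # two hypothetical surface elements with B below A
print(surface_atoms([A], TOP))            # A and its complement
print(sorted(surface_lattice([A, B], TOP)))
```

A system with $k$ surface atoms has a surface lattice of $2^k$ elements, which bounds the search space that Theorem III.1 guarantees is sufficient.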

Thus far we have established a framework within
which we can formally describe truth maintenance problems
and within which solutions can be connected with
the available data in the equations. For this framework to
be truly useful we must provide a way of finding solutions
other than by blindly enumerating candidates and testing
them. Suppose we could obtain a solution. How do we
know that this is the only solution? Or even the only surface
solution? To convey some idea of the challenge of this
problem consider the system $\mathcal{E}_8$

$$s = s \land \bigwedge_{1 \le k \le n} s_{2k}$$

$$s_1 = s \lor \overline{s_2}$$

$$s_2 = \overline{s_1}$$

$$\vdots$$

$$s_{2n-1} = s \lor \overline{s_{2n}}$$

$$s_{2n} = \overline{s_{2n-1}}$$

This system has $2^n$ well-founded surface solutions: $s$ is
always $\bot$ and we are free to choose $\bot$ or $\top$ as the value
of each of the odd-indexed unknowns. The usual method
of solving algebraic equational systems is by using substitution
together with other "legal" (with respect to the
algebraic system in question) transformations to produce
a new lattice equational system whose solutions are also
solutions of the original. Because of the algebraic nature
of the meet and join operators there is no obvious way
of effecting such a transformation. Before introducing the
more novel transformation that we will need, let us first
formalize the notion of substitution that we will be using.

Let $\mathcal{E}_{-s}$ be $\mathcal{E}$ less its $s$ equation. Systems will be presented
always according to some fixed lexical order. This
is possible since each system is obviously a recursive set.
Hence it is reasonable to speak of the $n$th occurrence of
the unknown, $s$, on the right-hand side of an equation in
$\mathcal{E}$. We define a local substitution, $\sigma_{s,n}$, as follows: If the $s'$
equation is the locus of the $n$th occurrence, then $\sigma_{s,n}(\mathcal{E})$
is $\mathcal{E}_{-s'}$ together with a new $s'$ equation wherein the $n$th
occurrence of $s$ is replaced by the right-hand side of the
$s$ equation in $\mathcal{E}$. If there is no $n$th occurrence then $\sigma_{s,n}(\mathcal{E}) = \mathcal{E}$.
Suppose there are $k$ $s$'s in $\mathcal{E}$ before the $s$ equation, $m$ in
the $s$ equation, and $n$ after the $s$ equation. The (global)
substitution transformation of $\mathcal{E}$ under $s$, $\sigma_s(\mathcal{E})$, is

$$\sigma_{s,1}(\cdots \sigma_{s,k-1}(\sigma_{s,k}(\sigma_{s,k+m+1}(\cdots \sigma_{s,k+m+n-1}(\sigma_{s,k+m+n}(\mathcal{E})) \cdots))) \cdots)$$

This transformation has the effect of replacing every right-hand
side occurrence of $s$ (except those in the $s$ equation)
with the right-hand side of the $s$ equation. The following
lemma suggests the other transformation necessary for
computing solutions.


Lemma III.1 Let $X_1$, $X_2$ and $X_3$ be expressions having
no occurrence of $s$, then

$$s = X_1 \lor (X_2 \land s) \lor (X_3 \land \overline{s})$$

implies

$$(X_1 \lor X_3) \le s \le (X_1 \lor X_2)$$

and hence

$$(\Gamma(X_1) \lor \Gamma(X_3)) \le \Gamma(s) \le (\Gamma(X_1) \lor \Gamma(X_2))$$

where $\Gamma$ is a solution of a system including the $s$ equation.
Let the $s$ equation be rearranged to have the form: $s =
X_1 \lor (X_2 \land s) \lor (X_3 \land \overline{s})$. The distributive nature of the
Boolean lattice guarantees that we can always do this. The
minimization transformation of $\mathcal{E}$ under $s$, $\mu_s(\mathcal{E})$, is $\mathcal{E}_{-s}$
together with the equation $s = X_1 \lor X_3$. This transformation
is semantically equivalent to having substituted $\bot$ for
every occurrence of $s$ on the right-hand side of the $s$ equation.
Or put another way, we are taking the lower bound
of the solution interval defined in the previous lemma.
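The interval bound of the lemma is easy to confirm exhaustively on a small lattice. This is a sanity check under stated assumptions rather than a proof: the lattice is taken to be the eight-element Boolean lattice of bitmasks over three atoms, and `check_lemma` and `minimized_rhs` are hypothetical helper names.

```python
from itertools import product

N_ATOMS = 3
TOP = (1 << N_ATOMS) - 1
ELEMENTS = range(TOP + 1)     # every element of the 8-element Boolean lattice


def leq(a, b):
    """Lattice order on bitmasks."""
    return a & b == a


def minimized_rhs(x1, x3):
    """Right-hand side after minimization: the result of substituting bottom for s."""
    return x1 | x3


def check_lemma():
    """For every (x1, x2, x3, s) satisfying the s equation, test the interval bound.

    Returns (holds, number_of_satisfying_tuples)."""
    satisfying = 0
    for x1, x2, x3, s in product(ELEMENTS, repeat=4):
        rhs = x1 | (x2 & s) | (x3 & (TOP & ~s))
        if s != rhs:
            continue                       # s does not solve the equation
        satisfying += 1
        if not (leq(minimized_rhs(x1, x3), s) and leq(s, x1 | x2)):
            return False, satisfying
    return True, satisfying


print(check_lemma())
```

The check also makes the role of minimization visible: `minimized_rhs` is exactly the lower end of the interval within which every solution value of $s$ must lie.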

We ask then whether or not minimization and substitution
can be used to produce a solution. The answer
will always be in the affirmative for an important class* of
equational systems. We may apply these transformations
to the original system of equations in such a way as to yield
a new system free of unknowns on the right-hand side. The
resulting system constitutes a solution for the original system
of equations. Before discussing the exact method of
applying these transformations, however, we offer the following
apparently technical but actually qualitatively important
result about minimization and substitution.

Lemma III.2 A well-founded solution, $\Gamma$, of a lattice
equational system, $\mathcal{E}$, is a well-founded solution of $\sigma_s(\mathcal{E})$
unless $\sigma_s$ involves a local substitution for a complemented
occurrence of $s$. If $\mu_s(\mathcal{E}) = \mathcal{E}'$, then a well-founded solution
$\Gamma'$ of $\mathcal{E}'$ is a well-founded solution of $\mathcal{E}$.

A close examination of the proof would reveal that the
substitution operation has the property that it preserves
"solution-ness" but may lose well-foundedness. On the
other hand, minimization preserves well-foundedness, in
the sense that any well-founded solution to the original
system that persists in being a solution to the transformed
system is still well-founded. Consider first the system $\mathcal{E}_4$.
Applying the transformation $\sigma_{s_2}$ yields the system $\mathcal{E}'_4$

$$s_1 = s_1$$

$$s_2 = \overline{s_1}$$

which has the same surface solutions as $\mathcal{E}_4$, but only the
second of them is well-founded. Applying $\mu_{s_1}$ to $\mathcal{E}'_4$ yields

$$s_1 = \bot$$

$$s_2 = \overline{s_1}$$

which has only one solution, $\{s_1 = \bot, s_2 = \top\}$.

A process for $\mathcal{E}$ is any functional composition of minimizations
and substitutions. $\mu_s$, $\sigma_{s,n}$ and $\sigma_s$ are all processes
for any lattice unknown, $s$. If $\pi$ is a process, so
are $\mu_s \circ \pi$, $\sigma_{s,n} \circ \pi$ and $\sigma_s \circ \pi$. A terminal process for
$\mathcal{E}$, denoted $\tau$, is a process such that for every process $\pi$,
$\tau \circ \pi(\mathcal{E}) = \tau(\mathcal{E})$.

*The well-known truth maintenance systems in the literature only
guarantee well-founded solutions for even equational systems.

A path of length $n$ from $s_0$ to $s_n$ is a sequence of triples
of the form

$$(X_1, Y_1, s_1), (X_2, Y_2, s_2), \ldots, (X_n, Y_n, s_n)$$

where $X_i \in \{s_{i-1}, \overline{s_{i-1}}\}$, $X_i$ is an antecedent of the $Y_i$ disjunct
of the $s_i$ equation in $\mathcal{E}$, and $1 \le i \le n$. $X_i$ is a
complemented (uncomplemented) unknown if it is a complemented
(uncomplemented) conjunct of $Y_i$. Unknown $s$
is connected to unknown $s'$ if there is a path of any length
from $s$ to $s'$. A path is odd if it has an odd number of
complemented unknowns and even otherwise. A system is
odd (and even otherwise) if it has an unknown, $s$, and an
odd path from $s$ to $s$.
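Whether a system is odd can be decided by a search that tracks the parity of complemented antecedents along a path. The sketch below is one possible reading, not the authors' procedure: a system is encoded (hypothetically) as a mapping from each unknown to its list of disjuncts, each disjunct being a list of `(antecedent, complemented)` pairs, with lattice constants simply omitted. The function `is_odd` searches the graph of `(unknown, parity)` states for a return to the starting unknown with odd parity.

```python
from collections import deque


def is_odd(system):
    """Return True if some unknown has an odd path back to itself."""
    # edges[u] lists (v, flip): u is an antecedent in the v equation,
    # and flip is 1 when that occurrence of u is complemented.
    edges = {}
    for consequent, disjuncts in system.items():
        for disjunct in disjuncts:
            for antecedent, complemented in disjunct:
                edges.setdefault(antecedent, []).append((consequent, int(complemented)))

    for start in system:
        seen = {(start, 0)}
        queue = deque([(start, 0)])
        while queue:
            node, parity = queue.popleft()
            for nxt, flip in edges.get(node, []):
                state = (nxt, parity ^ flip)
                if state == (start, 1):
                    return True            # closed path with odd parity
                if state not in seen:
                    seen.add(state)
                    queue.append(state)
    return False


odd_loop = {"s": [[("s", True)]]}                       # s = complement of s
mutual = {"s1": [[("s2", True)]], "s2": [[("s1", True)]]}
with_tail = dict(mutual, s3=[[("s1", False), ("s3", True)]])

print(is_odd(odd_loop), is_odd(mutual), is_odd(with_tail))
```

The two mutually complemented equations form an even system because the only cycle passes through two complements, whereas adding an equation whose unknown appears complemented on its own right-hand side makes the system odd.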

Theorem III.2 Every even lattice equational system, $\mathcal{E}$,
has a terminal process, $\tau$, such that $\tau(\mathcal{E})$ is a well-founded
solution of $\mathcal{E}$.

An  immediate  consequence  of  the  previous  theorem  is 
an  algorithm  that  is  analogous  to  Gaussian  elimination 
[Birkhoff  and  MacLane,  1965,  Gantmacher,  1959]  that  will 
always  produce  a  well-founded  solution  for  an  even  lattice 
equational  system: 

1. Let $x$ be a LIFO queue of the equations in $\mathcal{E}$; let $y$ be
an empty LIFO queue of equations in $\mathcal{E}$; let $z$ be an
equation in $\mathcal{E}$

2. Until $x$ is empty, $x$ becomes $\sigma_s(\mu_s(x))$ (the order of
equations remaining invariant with respect to their
left-hand sides) where $s$ is the unknown on the left-hand
side of the first equation in the queue, dequeue
$x$ to $z$, enqueue $z$ to $y$

3. Until $y$ is empty, $y$ becomes $\sigma_s(y)$ (the order of equations
remaining invariant with respect to their left-hand
sides) where $s$ is the unknown on the left-hand
side of the first equation in the queue, dequeue $y$ to
$z$, enqueue $z$ to $x$

4.  End 
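The steps above can be sketched in a few lines for a finite lattice. This is an illustrative reading rather than the authors' implementation: lattice elements are assumed to be bitmasks below `top`, forms are nested tuples, and `substitute`, `evaluate`, and `eliminate` are hypothetical names. Minimization is carried out as the substitution of $\bot$ for an unknown in its own equation, and back substitution is performed by evaluating the eliminated equations in reverse order.

```python
# Forms: ("const", bits) | ("var", name) | ("not", f) | ("and", f, g) | ("or", f, g)


def substitute(form, name, replacement):
    """Replace every occurrence of unknown `name` in `form` by `replacement`."""
    tag = form[0]
    if tag == "const":
        return form
    if tag == "var":
        return replacement if form[1] == name else form
    if tag == "not":
        return ("not", substitute(form[1], name, replacement))
    return (tag, substitute(form[1], name, replacement),
            substitute(form[2], name, replacement))


def evaluate(form, env, top):
    """Evaluate a form; unknowns absent from `env` are taken to be bottom."""
    tag = form[0]
    if tag == "const":
        return form[1]
    if tag == "var":
        return env.get(form[1], 0)
    if tag == "not":
        return top & ~evaluate(form[1], env, top)
    left, right = evaluate(form[1], env, top), evaluate(form[2], env, top)
    return left & right if tag == "and" else left | right


def eliminate(system, order, top):
    """Forward elimination in the given order, then back substitution."""
    remaining = dict(system)
    eliminated = []
    for s in order:
        rhs = substitute(remaining.pop(s), s, ("const", 0))    # minimization
        remaining = {u: substitute(f, s, rhs)                   # substitution
                     for u, f in remaining.items()}
        eliminated.append((s, rhs))
    solution = {}
    for s, rhs in reversed(eliminated):                         # back substitution
        solution[s] = evaluate(rhs, solution, top)
    return solution


# Two mutually complemented unknowns over the two-element lattice (top = 1).
pair = {"s1": ("not", ("var", "s2")), "s2": ("not", ("var", "s1"))}
print(eliminate(pair, ["s1", "s2"], 1))
print(eliminate(pair, ["s2", "s1"], 1))
```

Running the two orders on the pair of mutually complemented equations gives two different labelings, which illustrates how the elimination sequence selects among several well-founded solutions. No simplification of forms is attempted, so expression size can grow quickly on larger systems.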

This algorithm terminates with $x$ being a queue of equations
with constant right-hand sides (a solution). Step 2 is
the analogue of forward elimination; step 3 corresponds to
back substitution. Each (unspecified) order in which unknowns
are removed from the queue, $x$, is an elimination
sequence. Every such sequence produces a well-founded solution
(for an even system) and it may be the only sequence
that produces that particular solution. In re-examining $\mathcal{E}_4$
above we implicitly applied this algorithm, first eliminating
$s_1$ and thereby generating the first of the two well-founded
surface solutions. Had we done the other elimination first,
we would have obtained the second well-founded surface
solution. Can we generate all of the surface solutions by
varying the order of elimination? Unfortunately the answer
is no, as can be seen by examining the following system,
$\mathcal{E}_9$:

$$s_1 = A \land \overline{s_2}$$

$$s_2 = (A \land \overline{s_1}) \lor \overline{s_3}$$

$$s_3 = \overline{s_2}$$

only three of whose four well-founded solutions can be produced
by varying the order of elimination. As suggested by
the statement of the theorem, there are odd lattice equational
systems some of whose elimination sequences do not
produce solutions. Such a system is $\mathcal{E}_{10}$:

$$s_1 = \overline{s_2}$$

$$s_2 = \overline{s_1}$$

$$s_3 = s_1 \land \overline{s_3}.$$

Addressing either of the aforementioned deficiencies requires
the separability results of the next section.

IV.  The  Structure  of  Systems 
and  Solutions 

In this section we discuss some separability results for lattice
equational systems. These results are of two classes:
topological and algebraic. They are important because

• They provide the machinery from which all surface
solutions to all equational systems may be generated

• They provide the basic perspective for analyzing the
abstract complexity of the truth maintenance problem

• They suggest concrete "divide and conquer" algorithms
for solving the truth maintenance problem by

- supporting "lazy evaluation" of the solution vis-à-vis
any given unknown

- supporting incremental update of solutions as justifications
are added or removed

- enabling parallel solution methods

Let the equations of $\mathcal{E}$ be partitioned into equivalence
classes in such a way that the $s_1$ and $s_2$ equations of $\mathcal{E}$ will
be in the same equivalence class $\mathcal{E}'$, a subsystem of $\mathcal{E}$, just
in case $s_1$ is connected to $s_2$ in $\mathcal{E}$ and $s_2$ is connected to
$s_1$ in $\mathcal{E}$. Each equivalence class $\mathcal{E}'$ is a strongly connected
subsystem of $\mathcal{E}$. A partial order, $<_{\mathcal{E}}$, can be defined on
the strongly connected subsystems of an equational system
such that for subsystems $\mathcal{E}'$ and $\mathcal{E}''$, $\mathcal{E}' <_{\mathcal{E}} \mathcal{E}''$ if and
only if there is an $s'$ equation of $\mathcal{E}'$ and an $s''$ equation of
$\mathcal{E}''$ such that $s'$ is connected to $s''$. $\mathcal{E}''$ is minimal if and
only if there exists no $\mathcal{E}'$ such that $\mathcal{E}' <_{\mathcal{E}} \mathcal{E}''$. Given a
partitioning of a lattice equational system $\mathcal{E}$ into strongly
connected subsystems, there always exists at least one minimal
strongly connected subsystem of $\mathcal{E}$.

Proposition IV.1 Every lattice equational system can be
partitioned into a partial order of strongly connected subsystems.
Moreover, each of these subsystems can be treated
as if the unknowns whose corresponding equations are in
other strongly connected subsystems were lattice elements
(i.e., constants).


Since we can solve each of the partitions separately,
treating the unknowns whose corresponding equations are
not in the strongly connected subsystem being solved as
if they were expressions from the surface lattice (i.e., constants),
we can pursue a strategy of lazy evaluation and incremental
update. Not only can the system be partitioned
very efficiently [Aho et al., 1974], but there are also efficient
methods of updating this partially ordered partition
as justifications are added (and deleted). As things change
in the truth maintenance database, we compute new partitions
and only re-solve for unknowns whose equations are
in the same strongly connected subsystem as the changed
equation, and (optionally) for unknowns in greater subsystems.
For example, consider $E_{10}$. This system partitions
into two strongly connected subsystems. The $s_1$ and $s_2$
equations form the first and lesser (in $<_{E_{10}}$) subsystem;
the second strongly connected subsystem consists of the
$s_3$ equation. We can solve the second subsystem, treating
it as if $s_1$ were a constant. Note that the $s_3$ equation has a
solution only in the case that $s_1 = s_3 = \bot$. This “forces”
the solution of the first subsystem to be $\{s_1 = \bot, s_2 = \top\}$.
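
The partitioning and incremental re-solving strategy described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the dependency map `deps` is a hypothetical encoding (each unknown maps to the unknowns its equation mentions), the three-equation graph merely mimics the shape of the example (two mutually connected unknowns and a third that depends on one of them), and Tarjan's algorithm is used as one standard linear-time way to obtain strongly connected components; the paper does not say which partitioning algorithm the authors use.

```python
def strongly_connected_subsystems(deps):
    """Tarjan's algorithm. deps[u] lists the unknowns mentioned by u's equation.

    Returns the components as frozensets, with components that others
    depend on (the "lesser" subsystems) listed first.
    """
    index, low, on_stack, stack, order = {}, {}, set(), [], []

    def visit(u):
        index[u] = low[u] = len(index)
        stack.append(u)
        on_stack.add(u)
        for v in deps.get(u, ()):
            if v not in index:
                visit(v)
                low[u] = min(low[u], low[v])
            elif v in on_stack:
                low[u] = min(low[u], index[v])
        if low[u] == index[u]:              # u is the root of a component
            comp = set()
            while True:
                v = stack.pop()
                on_stack.discard(v)
                comp.add(v)
                if v == u:
                    break
            order.append(frozenset(comp))

    for u in deps:
        if u not in index:
            visit(u)
    return order


def affected(order, deps, changed):
    """Components to re-solve when the equation of `changed` is modified:
    its own component plus every component that depends on a dirty one."""
    dirty, result = {changed}, []
    for comp in order:                      # dependencies come first in `order`
        if comp & dirty or any(v in dirty for u in comp for v in deps.get(u, ())):
            dirty |= comp
            result.append(comp)
    return result


# Hypothetical dependency graph shaped like the example in the text.
deps = {"s1": ["s2"], "s2": ["s1"], "s3": ["s1"]}
order = strongly_connected_subsystems(deps)
```

Changing the `s3` equation touches only its own component, whereas changing `s1` marks both components for re-solving, which is the lazy, incremental behaviour the paragraph describes.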

A  lattice  equational  system,  E,  is  reduced  if  and  only 
if  for  every  pair  of  unknowns  s  and  s'  whose  equations  are 
in  the  same  strongly  connected  subsystem  of  E  such  that  s 
is  an  antecedent  of  s'  in  E,  s  is  a  nonmonotonic  antecedent 
of  s'. 

Theorem IV.1 For every lattice equational system, $E$,
there is a process, $\pi$, such that $E' = \pi(E)$ is reduced, and
$\Gamma$ is a well-founded solution of $E$ if and only if it is a
well-founded solution of $E'$.

The utility of the previous theorem becomes clearer when
we combine the computation of strongly connected subsystems
of a system with reduction and minimization. Since
we need to do only substitutions through uncomplemented
occurrences of unknowns in the original system in order to
reduce it, the resulting system has exactly the same
well-founded solutions as the original. Reconsider now $E_8$. If
we minimize with respect to $s$ and reduce again we get the
reduced system:

$$s = \bot$$

$$s_{11} = \overline{s_{21}}$$

$$s_{21} = \overline{s_{11}}$$

$$\vdots$$

$$s_{1n} = \overline{s_{2n}}$$

$$s_{2n} = \overline{s_{1n}}.$$

The system has now been separated into $n + 1$ strongly
connected subsystems, each of which is disconnected from
the others, hence independently solvable. We see now how
the $2^n$ solutions arise, since we have $n$ repetitions of the
system $E_4$.

If we combine the previous two results, and manipulate
the proof of theorem III.2 we obtain

Theorem IV.2 Let $E$ be a lattice equational system
whose surface lattice is $\{\top, \bot\}$. Every well-founded, surface
solution of $E$ is produced by some sequence of minimizations
and global substitutions.

Thus  if  we  restrict  the  lattice  over  which  the  equations 
are  taken  to  correspond  exactly  to  the  TMS’s  of  Doyle 
and  Goodwin,  each  well-founded  solution  will  be  produced 
by  some  elimination  sequence.  Unfortunately,  among  the 
large  number  of  elimination  sequences,  many  (in  the  case 
of  odd  systems)  do  not  produce  solutions.  Since  we  and 
others  [McAllester]  have  independently  shown  the  NP- 
completeness  (in  the  size  of  the  equational  system)  of  solv¬ 
ing  this  restricted  class  of  lattice  equational  systems,  there 
can  be  no  “easy”  characterization  of  the  circumstances  un¬ 
der  which  a  particular  elimination  sequence  will  produce  a 
well-founded solution.

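We turn now to the question of how to find all the
well-founded surface solutions for arbitrary systems. Let
$\Gamma_1 \vee \Gamma_2$, $\Gamma_1 \wedge \Gamma_2$, and $A \wedge \Gamma$ denote functions such that

$$(\Gamma_1 \vee \Gamma_2)(s) = \Gamma_1(s) \vee \Gamma_2(s), \qquad (\Gamma_1 \wedge \Gamma_2)(s) = \Gamma_1(s) \wedge \Gamma_2(s),$$

and $(A \wedge \Gamma)(s) = A \wedge \Gamma(s)$. A Goodwin projection of a lattice
element $B$ with respect to an atom $A$, denoted $\gamma_A(B)$, is
defined by

$$\gamma_A(B) = \begin{cases} \top & \text{if } A \le B \\ \bot & \text{otherwise.} \end{cases}$$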

We extend the notion of a Goodwin projection to expressions
over lattice elements by

$$\gamma_A(B_1 \wedge B_2) = \gamma_A(B_1) \wedge \gamma_A(B_2)$$

$$\gamma_A(B_1 \vee B_2) = \gamma_A(B_1) \vee \gamma_A(B_2)$$

$$\gamma_A(\overline{B}) = \overline{\gamma_A(B)}$$

and to lattice equations by applying the projection to each
constant term in each equation.
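
To make the projection concrete, here is a small Python sketch. It assumes, purely for illustration, that the surface lattice is the powerset of a finite set of atoms (a Boolean algebra), that lattice elements are `frozenset`s, and that the two-element lattice is encoded as Python booleans (`True` for top, `False` for bottom). The atom names and the tuple encoding of expressions are hypothetical and do not come from the paper.

```python
from itertools import combinations

ATOMS = ("a", "b", "c")              # hypothetical surface atoms
TOP = frozenset(ATOMS)               # top element of the powerset lattice


def elements():
    """All elements of the powerset lattice over ATOMS."""
    return [frozenset(c) for r in range(len(ATOMS) + 1)
            for c in combinations(ATOMS, r)]


def project(atom, b):
    """Goodwin projection of element b: top (True) if atom <= b, else bottom."""
    return atom in b


def project_expr(atom, expr):
    """Project an expression: a constant frozenset, ('and', x, y),
    ('or', x, y) or ('not', x), by projecting each constant term."""
    if isinstance(expr, frozenset):
        return project(atom, expr)
    op, *args = expr
    vals = [project_expr(atom, x) for x in args]
    if op == "and":
        return vals[0] and vals[1]
    if op == "or":
        return vals[0] or vals[1]
    if op == "not":
        return not vals[0]
    raise ValueError(f"unknown operator: {op}")


def commutes_with_operations():
    """Check that projecting a computed element agrees with projecting
    the expression term by term, for every atom and every pair of elements."""
    for a in ATOMS:
        for b1 in elements():
            if project(a, TOP - b1) != project_expr(a, ("not", b1)):
                return False
            for b2 in elements():
                if project(a, b1 & b2) != project_expr(a, ("and", b1, b2)):
                    return False
                if project(a, b1 | b2) != project_expr(a, ("or", b1, b2)):
                    return False
    return True
```

Under the powerset assumption, `commutes_with_operations()` confirms that the three defining equations are consistent: projecting with respect to an atom carries meets, joins and complements of the surface lattice to the corresponding operations of the two-element lattice.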

Theorem IV.3 $\Gamma$ is a well-founded surface solution of
$E$ if and only if there exists a subset, $\{A_i \mid 1 \le i \le N\}$,
of the atoms of the surface lattice of $E$, and corresponding
Goodwin projections, $\{\gamma_{A_i} \mid 1 \le i \le N\}$, such that

$$\Gamma = \bigvee_{i=1}^{N} A_i \wedge \Gamma_i$$

where the $\Gamma_i$ are well-founded surface solutions of the
corresponding projected systems.

This theorem guarantees a unique prime (where the
primes are the surface atoms) factorization of lattice equational
systems and their solutions. Notice that each Goodwin
projection of a system $E$ results in a system whose
surface lattice is $\{\top, \bot\}$. We “know” how to solve these
by theorem IV.2. Hence factoring followed by finding all
of the (successful) elimination sequences produces all of
the surface solutions. In particular, all four of the
well-founded surface solutions of $E_9$ can be produced by taking
the Goodwin projections with respect to the surface
atoms, $A$ and $\overline{A}$, solving the two resulting systems,
“multiplying” the resulting solutions by $A$ and $\overline{A}$ respectively,
and “adding” the results. Since each projected system has
two solutions, the overall system has four. Elsewhere
[Benanav et al.] we have shown the problem of solving general
lattice equational systems to be NP-hard in the size
of the system. We see now how this might arise: Using
the algebraic results that we have cited can produce solutions
at the cost of composing two potentially exponential
processes, the projection by surface atoms and the finding
of minimization and substitution sequences that actually
produce a well-founded solution.

V.  Conclusions 

In  the  foregoing  we  have  introduced  a  general  model  of 
truth  maintenance  couched  in  a  lattice-theoretic  frame¬ 
work.  All  of  the  truth  maintenance  systems  familiar  to 
us  in  the  literature  can  be  construed  as  solving  systems  of 
lattice  equations.  Indeed,  those  systems  can  be  properly 
embedded  in  our  lattice-theoretic  formalism.  We  intro¬ 
duced  the  fundamental  transformations  of  substitution  and 
minimization  and  showed  how  they  could  be  used  to  pro¬ 
duce  solutions  of  even  lattice  equational  systems.  We  have 
cited  a  number  of  theoretical  results  about  the  algebraic 
structure  of  truth  maintenance  systems  and  interpreted 
these results in terms of concrete examples. We have used
these examples to illustrate how a given formal algebraic
result either reveals some intrinsic difficulty, or how it can
be used to computational advantage. Finally, we sketched
how our separation results can be used to generate all of
the surface solutions of an arbitrary lattice equational system.
The principal technical contributions of those aspects
of  our  work  on  lattice-theoretic  truth  maintenance  that  we 
have  presented  in  this  paper  are: 

•  The formalization of truth maintenance in a way
that properly includes nonmonotonic justifications
and assumption-based justifications

•  The presentation of a point of view from which one
can algebraically analyze the structure of truth maintenance
problems and the construction of solutions to
those problems

•  The  motivation  of  the  algebraic  results  with  compu¬ 
tational  and  phenomenological  interpretations 

Though not the topic of this paper, it is also from this same
lattice-theoretic point of view that we have carried out the
analysis of the abstract and computational complexity of
truth maintenance.
Abstract

We  give  a  formal  semantics  to  truth  maintenance  by 
offering  here  a  mathematical  logic — equipped  with  an 
underlying  model  theory — that  is  used  to  character¬ 
ize  quite  precisely  some  well  known  models  of  truth 
maintenance.  Our  usage  of  ‘precise’  is  doubly  in¬ 
tended  in  that  we  give  meaning  to  truth  maintenance 
in  terms  of  a  formal  logic,  and  that  each  character¬ 
izing  logic  corresponds  to  a  particular  truth  mainte¬ 
nance  system  and  vice  versa. 

I.  Introduction 

The  history  of  mathematics  is  replete  with  formal  systems 
consisting  of  symbols  and  operational  transformations  on 
same,  wherein  the  utility  of  the  formal  systems  had  been 
explored  and  exploited  long  before  completely  satisfactory 
(mathematical)  semantical  accounts  of  the  systems  were 
provided.  Examples  of  this  are  the  integral  calculus,  predi¬ 
cate  calculus  and  the  lambda  calculus  whose  corresponding 
mathematical  semantics  are  respectively  Lebesgue  mea¬ 
sure,  Tarskian  semantics,  and  Scott  semantics.  Truth 
maintenance  systems  are  a  more  recent  instance  of  this 
phenomenon  where  operational  utility  has  been  realized 
in  advance  of  mathematical  justification.  Although  there 
are  various  logical  accounts  of  nonmonotonic  reasoning 
(see  [Perlis,  1984]  for  a  complete  survey)  that  have  been 
equipped  with  suitable  formal  semantics  (including  our 
own attempt in [Brown, 1985]), none of these accounts captures
truth maintenance with satisfactory precision. The
questions we propose to answer here are:

1.  With  respect  to  what  logic  might  the  formulae  labeled 
as  “IN”  by  truth  maintenance  systems  be  counted  as 
theorems? 

2.  What exactly is the logical status of formulae labeled
as “OUT”?

In the following we will develop logics and associated semantics
that correspond to the TMS’s of Doyle [Doyle,
1979a, Doyle, 1979b, Doyle, 1978] and Goodwin [Goodwin,
1982, Goodwin, 1985, Goodwin, 1984, Goodwin, 1987],
the ATMS of de Kleer [de Kleer, 1986a, de Kleer, 1984,
de Kleer, 1986b, de Kleer, 1986c], and our own AN RMS
[Brown et al., 1987, Benanav et al., Forthcoming, Gaucas
and Brown, 1987]. We will first provide a logic and model
theory for the TMS’s of Doyle and Goodwin. We will then
reduce the logical characterization of other TMS’s to the
Doyle/Goodwin case. As we pointed out earlier, our principal
task is to make logical sense of “IN” and “OUT”.
We do this by formalizing the propositional attitude
of  belief  for  the  propositions  of  an  underlying  logical  lan¬ 
guage.  We  call  these  beliefs  justified  in  that  they  are  the 
consequents  of  syntactically  well-formed  arguments.  We 
distinguish  them  from  the  true  beliefs  [Gettier,  1967,  Grif¬ 
fiths,  1967,  Malcolm,  1967,  Prichard,  1967]  ordinarily  of 
interest  to  philosophers  in  that  we  are  disinterested  in  the 
logical  soundness  of  the  arguments  in  question  (just  as  is 
the  case  for  a  truth  maintenance  system).  The  logics  we 
construct  will  give  a  syntactic  characterization  to  justi¬ 
fications  and  beliefs.  When  consistent,  these  logics  will 
count  propositions  as  believed  just  in  case  the  correspond¬ 
ing  truth  maintenance  system  would  have  labeled  them 
“IN”. 

II.  Nonmonotonic  Truth 
Maintenance 

A.  Syntax 

Let $\mathcal{L}$ be a first-order language equipped with functions,
predicates, connectives, quantifiers, and perhaps even
modalities.* $\mathcal{L}$ has the usual formation rules for first-order
languages. The details of $\mathcal{L}$ will not concern us very much
here. $p, q, r$ (possibly subscripted) range over formulae of
$\mathcal{L}$. We define the language $\mathcal{L}_J$ as follows:

1.  If $p$ is a formula of $\mathcal{L}$, then $B[p]$ and $\neg B[p]$ are formulae
of $\mathcal{L}_J$. Formulae of this form are called elementary
(respectively positive and negative) beliefs with core
$p$. The set of beliefs (positive and negative) will be
denoted $B[\mathcal{L}]$. Similarly, if $A \subseteq \mathcal{L}$, $B[A]$ is the set of
positive and negative beliefs whose cores are in $A$.

2.  If $p, q_1, \ldots, q_m, r_1, \ldots, r_n$ are formulae of $\mathcal{L}$, then
$J[p]$, $J[p|q_1, \ldots, q_m]$, $J[p|q_1, \ldots, q_m|r_1, \ldots, r_n]$, and
$J[p||r_1, \ldots, r_n]$ are all in $\mathcal{L}_J$. Formulae of the latter
form are called justifications. $p$ is the consequent of the
justifications, while $q_1, \ldots, q_m$ and $r_1, \ldots, r_n$ are respectively
the monotonic and nonmonotonic antecedents


* We will freely use the connectives and quantifiers of first-order
logic as part of our ordinary mathematical discourse. Since no formulae
of $\mathcal{L}$ are ever actually mentioned, we trust that this will cause
no confusion.
of the justifications where they are mentioned. $j$ will
be a variable that ranges over justifications. Justifications
without nonmonotonic antecedents are termed
monotonic while those with are termed nonmonotonic.
$\alpha(j)$, $\overline{\alpha}(j)$, and $\kappa(j)$ are respectively the
monotonic and nonmonotonic antecedents, and the consequent
of the justification, $j$.

3.  No other formulae are in $\mathcal{L}_J$.

Since $\mathcal{L}$ (and consequently $\mathcal{L}_J$) is presumed to be recursive,
we may also presume the existence of a total lexical
ordering on the formulae of $\mathcal{L}_J$.

A TMS theory, $T$, is any finite set of justifications.
A TMS theory having nonmonotonic justifications is nonmonotonic.
Let $A$ be the subset of $\mathcal{L}$ containing exactly
those formulae that appear as antecedents or consequents
of justifications in $T$. Let $Y$ be the set

$$\{T \cup X \mid X \subseteq B[A] \wedge (\forall p \in A)\, B[p] \in X \vee \neg B[p] \in X\}$$

Let $\mathcal{J}(S)$ be a subset of $\mathcal{L}_J$ such that

1.  $S \subseteq \mathcal{J}(S)$;

2.  $B[p]$ is in $\mathcal{J}(S)$ whenever there is a justification in $S$ of
which $p$ is the consequent, for each of whose monotonic
antecedents, $q$, $B[q]$ is in $S$, and for each of whose
nonmonotonic antecedents, $r$, $\neg B[r]$ is in $S$;

3.  $\neg B[p]$ is in $\mathcal{J}(S)$ if for every justification in $S$ of which
$p$ is the consequent, then for at least one of its monotonic
antecedents, $q$, $\neg B[q]$ is in $S$, or for at least one
of its nonmonotonic antecedents, $r$, $B[r]$ is in $S$;

4.  no other formulae are in $\mathcal{J}(S)$.

A justification closure operator, $C_T$, for a TMS theory, $T$,
is a function on elements of $Y$ defined by

$$C_T(S) = \mathcal{J}(T \cup S).$$
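
As an informal illustration, the four clauses and the closure operator can be transcribed into Python. The encoding is an assumption made for this sketch only: a justification is a triple `(consequent, monotonic, nonmonotonic)`, a belief is a pair `(core, sign)`, and the theory is passed separately from the belief set rather than being unioned into it. Clause 3 is read non-vacuously here, so a negative belief is derived only for formulae that are the consequent of at least one justification; that reading is a guess suggested by the later remark that fixed points are indifferent to most beliefs.

```python
def J(p, mono=(), nonmono=()):
    """Justification J[p | mono | nonmono] as a hashable triple."""
    return (p, tuple(mono), tuple(nonmono))


def B(p):
    """Positive belief B[p]."""
    return (p, True)


def notB(p):
    """Negative belief (not B[p])."""
    return (p, False)


def closure(theory, beliefs):
    """One application of the closure operator to the belief part of S."""
    result = set(beliefs)                          # clause 1: S is retained
    for p in {j[0] for j in theory}:
        about_p = [j for j in theory if j[0] == p]

        def fires(j):
            # all monotonic antecedents believed, all nonmonotonic ones disbelieved
            return (all(B(q) in beliefs for q in j[1])
                    and all(notB(r) in beliefs for r in j[2]))

        def blocked(j):
            # some monotonic antecedent disbelieved or some nonmonotonic one believed
            return (any(notB(q) in beliefs for q in j[1])
                    or any(B(r) in beliefs for r in j[2]))

        if any(fires(j) for j in about_p):
            result.add(B(p))                       # clause 2
        if all(blocked(j) for j in about_p):
            result.add(notB(p))                    # clause 3
    return result


def is_fixed_point(theory, beliefs):
    """True when applying the closure operator adds nothing new."""
    return closure(theory, beliefs) == set(beliefs)
```

For instance, with the single premiss `J("p")`, the set containing only `B("p")` is a fixed point, while starting from `notB("p")` the operator adds `B("p")` and so exposes an inconsistency.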

We are typically interested in the least fixed points of $C_T$.
We will abuse our notation by occasionally referring to a
particular fixed point as $C_T$. Also, we will refer to a fixed
point of a theory, $T$, meaning a fixed point of the justification
closure operator for that theory. A set of formulae, $S$,
will be termed inconsistent if it contains both a belief, $B[p]$,
and its negation, $\neg B[p]$. Notice that it will typically be the
case that a fixed point will be indifferent to most beliefs
(i.e., it will contain neither $B[p]$ nor $\neg B[p]$). In such cases
we are free to add either a positive or (exclusively) a negative
belief and still have a consistent fixed point of $T$, though
no longer least. Since truth maintenance systems by and
large profess disbelief in any formula (of $\mathcal{L}$) for which there
is no argument, we will augment a least fixed point with
any negative belief for which there is no corresponding positive
belief in the fixed point. Such an augmentation is a
justification completion of a TMS theory.

For fixed points to be interesting they must exist:

Proposition  II.  1  Every  TMS  theory  has  a  least  fixed 
point. 


Proof. Let $T$ be a TMS theory. Clearly $T \cup B[A]$ is in $Y$
and is a finite fixed point of $C_T$. $T \cup B[A]$ being finite, it
contains some least subset $S$ that is an element of $Y$ and is
a fixed point of $C_T$. Hence $S$ is a least fixed point of $C_T$. □

Since TMS labelings are obviously nonmonotonic in
that the addition of new justifications can cause formulae
formerly labeled as “IN” to be relabeled “OUT”, the corresponding
logical theory ought to have this property as
well:

Proposition II.2 There exist TMS theories $T_1$ and $T_2$
such that there is no least fixed point of $T_1 \cup T_2$ containing
any least fixed point of $T_1$.

Proof. Consider the TMS theories $\{J[p||p]\}$ and
$\{J[p]\}$. The first theory has the least fixed point
$\{J[p||p], B[p], \neg B[p]\}$, and the second has the least fixed
point $\{J[p], B[p]\}$, while their union has the least fixed
point $\{J[p], J[p||p], B[p]\}$. Since all of these least fixed
points are unique for their respective theories, the proposition
follows. □

A partial order, a subset of $\mathcal{D}^2$, is graded if there is a
function from $\mathcal{D}$ into the non-negative integers such that

1.  every $d \in \mathcal{D}$ has a grade;

2.  the grade of $d \in \mathcal{D}$ is 0 whenever there is no $d' \in \mathcal{D}$
such that $d'$ is less than $d$ in the partial order;

3.  the grade of each $d \in \mathcal{D}$ is larger than that of every
$d' \in \mathcal{D}$ smaller than $d$ in the partial order.

The grading, $\delta$, for a partial order over the domain, $\mathcal{D}$, will
be termed standard if it satisfies
$\delta(d) = 1 + \max\{\delta(d') \mid d' < d \wedge \neg(\exists d'')\, d' < d'' < d\}$.
A unique standard grading always
exists for a graded partial order. Henceforth we will
assume ‘standard’ whenever we mention ‘grading’. A fixed
point, $S$, of a theory, $T$, is well-founded if there is a graded
partial order, $<_S$, on positive beliefs (of $S$) such that for
every positive belief, $B[p] \in S$, there is a justification,
$J[p|q_1, \ldots, q_m|r_1, \ldots, r_n]$, such that $B[q_1], \ldots, B[q_m] \in S$,
$\neg B[r_1], \ldots, \neg B[r_n] \in S$, and $q_1, \ldots, q_m <_S p$.
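
The standard grading can be computed directly from its defining recurrence. The following Python sketch assumes a finite domain and represents the strict partial order as a transitively closed set of pairs `(a, b)` meaning `a < b`; the function name and the four-element example are hypothetical.

```python
def standard_grading(domain, less):
    """Return {d: grade} for a finite strict partial order.

    `less` must be transitively closed. Minimal elements get grade 0;
    every other element gets 1 + the largest grade among the elements
    it covers (its immediate predecessors).
    """
    def covers(lo, hi):
        # lo < hi with nothing strictly in between
        return (lo, hi) in less and not any(
            (lo, mid) in less and (mid, hi) in less for mid in domain)

    grade = {}

    def delta(d):
        if d not in grade:
            below = [delta(lo) for lo in domain if covers(lo, d)]
            grade[d] = 1 + max(below) if below else 0
        return grade[d]

    for d in domain:
        delta(d)
    return grade


# Hypothetical example: the chain a < b < d together with c < d.
domain = ["a", "b", "c", "d"]
less = {("a", "b"), ("b", "d"), ("a", "d"), ("c", "d")}
grades = standard_grading(domain, less)
```

Here `d` receives grade 2 because its immediate predecessor `b` has grade 1, even though its other immediate predecessor `c` is minimal; the grade of an element is thus the length of the longest chain beneath it.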

We  complete  this  section  with  some  additional  proof- 
theoretic  results  for  TMS  theories  that  will  serve  us  later 
in  our  investigation. 

Proposition II.3 Let $S$ be a well-founded least fixed point
of the justification closure operator, $C_T$, of the TMS theory
$T$. Let $<_S$ be a graded partial order for $S$. There exists a
least partial order contained in $<_S$ under which $S$ remains
well-founded.

Proof. Suppose $S$, $T$ and $<_S$ are as in the statement of the
proposition. By the definition of well-founded, for every
$B[p] \in S$ there is a justification, $j \in T$, such that $\kappa(j) = p$
and for each $q \in \alpha(j)$, $q <_S p$ and $B[q] \in S$. Our
first task is to extend the partial order, $<_S$, to include
justifications in $T$. Let $<_1$ be the least transitive partial
ordering on beliefs and justifications such that: $q <_1 j$
if and only if $j \in T$, $B[\kappa(j)] \in S$, $q \in \alpha(j)$, $q <_S \kappa(j)$,
$(\forall r \in \alpha(j))\, B[r] \in S \wedge r <_S \kappa(j)$, and $(\forall r \in \overline{\alpha}(j))\, \neg B[r] \in S$;
$j <_1 p$ if and only if $j \in T$, $p = \kappa(j)$, $B[p] \in S$,
$(\forall q \in \alpha(j))\, B[q] \in S \wedge q <_S p$, and $(\forall r \in \overline{\alpha}(j))\, \neg B[r] \in S$. Clearly
if $<_S$ is graded, so is $<_1$. Let $<_2$ be a suborder of $<_1$
such that $p <_2 j$ if and only if $p <_1 j$, and $j <_2 p$ if
and only if $j$ is the lexically smallest justification of least
grade (in $<_1$) having $p$ as a consequent. Clearly any given
$p$ is immediately preceded (in the $<_2$ order) by at most
one justification. Finally, $x <_3 y$ if and only if $x <_2 y$
and $y <_2 p$ where $B[p] \in S$. $<_3$ is a graded partial order
having the property that it is a suborder of $<_S$ and every
positive belief of $S$ not in $T$ is preceded (in $<_3$) by exactly
one justification from $T$. $<_3$ is clearly minimal. □

A TMS theory, $T$, together with $C_T$ is a logic of justified
belief. As mentioned earlier, we speak of justified
beliefs rather than true beliefs. For a belief to be justified
we merely require it to be grounded in a well-founded argument
(the partial order $<_T$ together with a suitable set of
justifications). Thus it is possible for both $B[p]$ and $B[\neg p]$
to be justified in a consistent TMS theory even though
this pair of beliefs would not be held by a rational agent.
This contrasts with logics of true belief wherein the cores
of positive (negative) beliefs are typically (non-)theorems
in some underlying $\mathcal{L}$-theory. In general we shall be interested
in least fixed points of the justification operators
of particular TMS theories, where those fixed points are
well-founded under the associated partial order. Consider
the TMS theories

$$T_1 = \{J[p||p]\},$$

$$T_2 = \{J[p|q], J[q|p]\},$$

$$T_3 = \{J[p||q], J[q||p]\},$$

$$T_4 = \{J[p||q], J[p|q], J[q|p], J[q||q]\}.$$

$T_1$ has a single least fixed point, $T_1 \cup \{B[p], \neg B[p]\}$, and it
is inconsistent. $T_2$ has two consistent least fixed points,
$T_2 \cup \{\neg B[p], \neg B[q]\}$ and $T_2 \cup \{B[p], B[q]\}$, of which the
first is well-founded. $T_3$ has two least fixed points,
$T_3 \cup \{B[p], \neg B[q]\}$ and $T_3 \cup \{B[q], \neg B[p]\}$, and each of them is
consistent and well-founded. $T_4$ has a single least fixed
point, $T_4 \cup \{J[q|p], J[q||q], B[q], B[p]\}$, and it is consistent
and not well-founded.
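
The claims about these four theories can be checked by brute force. The sketch below is not the authors' procedure: it simply enumerates every candidate belief set over the two atoms, keeps those left unchanged by the closure rules, and retains the inclusion-minimal ones. Justifications are encoded as hypothetical triples `(consequent, monotonic, nonmonotonic)`, beliefs as pairs `(core, sign)`, and well-foundedness is tested by repeatedly grounding positive beliefs, which for finite sets is equivalent to the existence of a suitable graded partial order.

```python
from itertools import product


def J(p, mono=(), nonmono=()):
    return (p, tuple(mono), tuple(nonmono))


def closure(theory, s):
    """Apply the closure rules once to the belief set s."""
    out = set(s)
    for p in {j[0] for j in theory}:
        js = [j for j in theory if j[0] == p]
        if any(all((q, True) in s for q in m) and all((r, False) in s for r in n)
               for _, m, n in js):
            out.add((p, True))          # some justification of p fires
        if all(any((q, False) in s for q in m) or any((r, True) in s for r in n)
               for _, m, n in js):
            out.add((p, False))         # every justification of p is blocked
    return out


def least_fixed_points(theory):
    """Inclusion-minimal fixed points among sets that decide every atom."""
    atoms = sorted({a for p, m, n in theory for a in (p, *m, *n)})
    options = [{True}, {False}, {True, False}]
    fixed = []
    for choice in product(options, repeat=len(atoms)):
        s = {(a, sign) for a, signs in zip(atoms, choice) for sign in signs}
        if closure(theory, s) == s:
            fixed.append(s)
    return [s for s in fixed if not any(t < s for t in fixed)]


def consistent(s):
    return not any((a, not sign) in s for a, sign in s)


def well_founded(theory, s):
    """Ground positive beliefs bottom-up; well-founded iff all get grounded."""
    grounded, changed = set(), True
    while changed:
        changed = False
        for p, mono, nonmono in theory:
            if ((p, True) in s and p not in grounded
                    and all(q in grounded for q in mono)
                    and all((r, False) in s for r in nonmono)):
                grounded.add(p)
                changed = True
    return grounded == {a for a, sign in s if sign}


T1 = [J("p", (), ("p",))]
T2 = [J("p", ("q",)), J("q", ("p",))]
T3 = [J("p", (), ("q",)), J("q", (), ("p",))]
T4 = [J("p", (), ("q",)), J("p", ("q",)), J("q", ("p",)), J("q", (), ("q",))]
```

Calling `least_fixed_points` on each theory and classifying the results with `consistent` and `well_founded` reproduces the counts stated above; for `T4`, for example, the only least fixed point believes both `p` and `q`, and neither belief can be grounded without the other.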

Proposition II.4 A justification completion of a monotonic
TMS theory is consistent.

Proof. That there is a justification completion has already
been guaranteed. Without loss of generality, we may replace
‘completion’ with ‘closure’ in the statement of the
proposition. Suppose that there are elementary beliefs $B[p]$
and $\neg B[p]$ in the justification closure of the monotonic theory,
$T$. Suppose further that all of the justifications in $T$
having $p$ as a consequent have no antecedents. The definition
of $\mathcal{J}$ guarantees that no least fixed point of $C_T$ can
contain both $B[p]$ and $\neg B[p]$. Hence for every contradictory
pair $\{B[p], \neg B[p]\}$ there must be at least one justification
having $p$ as a consequent and a non-empty set of
antecedents. Furthermore, the definition of $\mathcal{J}$ also guarantees
that for each justification with consequent $p$ there
is an antecedent $q$ such that $B[q]$ and $\neg B[q]$ are in the
justification closure of $T$. It is easily verified that if we
remove the negative belief of every contradictory pair in
the justification closure, the resulting (consistent) set of
beliefs is still a fixed point of $C_T$. This last contradicts the
claim that we started with a justification closure, and the
proposition follows. □

We will say that $q$ is directly connected to $p$ in a theory
$T$ if there is a justification in $T$ of which $q$ is an antecedent
and $p$ is the consequent. The ‘connected to’ relation
is then the least transitive relation containing the
‘directly connected to’ relation. $p$ and $q$ are strongly connected
in $T$ if $p$ is connected to $q$ in $T$ and $q$ is connected
to $p$ in $T$. Finally, a strongly connected component of $T$ is
a subset of $\mathcal{L}$ such that every pair of elements in the subset
is strongly connected. A maximal strongly connected component
in $T$ is one that is contained in no larger strongly
connected component. Henceforth we will only consider
maximal strongly connected components.

Proposition  II. 5  Every  monotonic  TMS  theory  has  a 
consistent,  well-founded  justification  completion. 

Proof. That a TMS theory, $T$, has a consistent justification
completion is guaranteed by the previous proposition.
Again we may restrict our attention to the justification closure.
In order for the justification closure of such a theory
not to be well-founded, one can readily verify that there
must exist a maximal strongly connected component, $S$,
of $T$ such that for every $p \in S$:

1.  $B[p]$ is in the justification closure,

2.  every justification in $T$ has at least one antecedent in
$S$ if it has consequent $p$ and for all of its antecedents,
$r$, $B[r]$ is in the justification closure.

Now observe that if every positive belief in the justification
closure with core in $S$ is replaced by the corresponding
negative belief, the resulting set of formulae will still be a
least fixed point of $C_T$. If we apply this replacement recipe
to every $S$ fitting the description that we gave above, we
will be left with a well-founded, least fixed point of $C_T$. □

Proposition II.6 Every monotonic TMS theory has a
unique consistent well-founded justification completion.

Proof. That a monotonic TMS theory, $T$, has a consistent,
well-founded justification completion is already established
by the propositions above. Our aim here is to establish
uniqueness. Every theory $T$ can be uniquely partitioned
into strongly connected components. Moreover, we will
say that one such component is below another just in case
there is a $p$ in the first connected to some $q$ in the second.
This relation obviously induces a unique partial order on
strongly connected components in $T$. Let $T$ be the theory
having the strongly connected component of smallest
size such that $T$ has two distinct well-founded justification
completions, $S_1$ and $S_2$. Clearly some strongly connected
component, $Z$, of least height in $T$ must be such
that the subsets of beliefs from each of the completions,
$S_1$ and $S_2$, whose cores are exactly the elements of $Z$ will
also be distinct. We may presume that $Z$ has nothing below
it, for if it did, there would be a smaller theory $T'$
that also had distinct well-founded justification completions.
This new theory would be obtained from $T$ by first
deleting any justifications connecting strongly connected
components below $Z$ or justifications whose antecedents
are in strongly connected components below $Z$. We would
add a justification with consequent $p$ and having no antecedents
whenever there was a justification, $j$, in $T$ such
that $(\forall q \in \alpha(j))\, B[q] \in S_1$. There exists a $p \in Z$ such
that $B[p]$ is in one of those justification completions and
$j = J[p]$ is in $T$. But this means that the theory $T$ less
any justification $j' \neq j$ with consequent $p$ must also have
the same two distinct justification completions as $T$. (Note that at
least one such $j'$ exists in order for $T$ to have a single
strongly connected component of more than one element.)
This contradicts the assumption that $T$ has the strongly
connected component of smallest size while also having two
distinct well-founded justification completions. □

The  justification  completion  is  meant  to  capture  the 
constraint  propagation  processes  implicit  in  the  truth 
maintenance  systems  of  Doyle  and  Goodwin.  Justification 
completions of TMS theories, in contrast to the deductive
closures of $\mathcal{L}$-theories, are meant to capture that which has
been proven in contrast to that which is provable. The correspondence
between the syntactic notion of justification
given  above  and  the  homonymous  notion  in  the  TMS’s  of 
Doyle  and  Goodwin  will  be  apparent  to  readers  familiar 
with  those  investigators’  systems.  Our  aim  here  is  for  the 
justifications  in  T,  having  no  antecedents,  to  correspond 
to  the  premisses  of  a  typical  truth  maintenance  system. 
The  elementary  positive  and  negative  beliefs  in  the  jus¬ 
tification  completion  are  meant  to  correspond  to  the  for¬ 
mulae  labeled  (respectively)  “IN”  and  “OUT”  by  a  TMS. 
Readers  can  readily  verify  that  the  TMS’s  of  Doyle  and 
Goodwin  would  label  a  node  as  “IN”  with  respect  to  a 
set  of  justifications  only  if  the  proposition  associated  with 
that  node  were  the  core  of  a  positive  belief  in  the  justifica¬ 
tion  completion  of  the  corresponding  TMS  theory.  Having 
given  a  syntactic  characterization  to  truth  maintenance  by 
defining  a  formal  logic  of  justified  belief,  we  turn  now  to 
supplying a suitable semantics for that logic.

B.  Semantics 

In this section we will equip TMS logics with a possible
world semantics [Bradley and Swartz, 1981, Chellas, 1980,
Hughes and Cresswell, 1984, Hughes and Cresswell, 1968].
This semantics is slightly unusual in that there are two accessibility
relations, one to give meaning to justifications
(or validity in the terminology of Goodwin) and one to
give meaning to well-foundedness; furthermore, the accessibility
relation corresponding to justifications relates arbitrary
(finite) ordered tuples of worlds in contrast to the
usual ordered pairs of worlds. In order to capture the
nonmonotonicity of TMS theories, we base our semantics on
the idea of minimal models, a notion introduced by
McCarthy [McCarthy, 1980] and Davis [Davis, 1980], further
pursued by Bossu and Siegel [Bossu and Siegel, 1985], and
ultimately explored and exploited by Shoham [Shoham,
1986]. Finally, it will develop that the computation carried
out by a truth maintenance system will correspond
to the construction of an appropriate model should such a
structure exist.

An interpretation, $M$, is a structure $\langle W, \pi, \rho, \prec \rangle$. We
will subscript the various elements of a structure as required
to avoid ambiguity of reference. $W = \{w_p \mid p \in \mathcal{L}\}$
is a set of moments of truth. $u, v, w$ (possibly subscripted)
will denote moments. $\prec\ \subseteq W^2$. $\pi$ is of type $\pi: \mathcal{L} \to 2^W$.
$M$ satisfies $p \in \mathcal{L}$ at moment $w$, denoted $M, w \models p$, just
in case $(\pi(p))(w) = 1$. We impose an additional restriction
on $\pi$ that $w_p \prec w \Rightarrow (\pi(p))(w) = (\pi(p))(w_p)$. $\rho$ is of type
$\rho: (\mathcal{L}_J - B[\mathcal{L}]) \to \bigcup_{m,n} 2^{W \times W^m \times W^n}$. In particular,

$$\rho(J[p|q_1, \ldots, q_m|r_1, \ldots, r_n]) \in 2^{W \times W^m \times W^n}.$$

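$M$ satisfies $J[p \mid q_1, \ldots, q_m \mid r_1, \ldots, r_n]$, denoted

$$M \models J[p \mid q_1, \ldots, q_m \mid r_1, \ldots, r_n],$$

just in case

1. $(\rho(J[p \mid q_1, \ldots, q_m \mid r_1, \ldots, r_n]))(w_p, w_{q_1}, \ldots, w_{q_m}, w_{r_1}, \ldots, w_{r_n}) = 1$;

2. and if

(a) $(\rho(J[p \mid q_1, \ldots, q_m \mid r_1, \ldots, r_n]))(u, v_1, \ldots, v_m, w_1, \ldots, w_n) = 1$,

(b) $M, v_1 \models q_1, \ldots, M, v_m \models q_m$,
$M, w_1 \not\models r_1, \ldots, M, w_n \not\models r_n$,

then $M, u \models p$.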

$M \models B[p]$ just in case $M, w_p \models p$. An interpretation, $M$,
is said to be a model of a set of formulae $S \subseteq \mathcal{L}_1$, denoted
$M \models S$, if and only if it satisfies every formula in $S$.

Let $\triangleleft$ be the least irreflexive, asymmetric, transitive
relation on models of $T$, a TMS theory, such that if $M_1$
and $M_2$ are models of $T$ then the following criteria are
met:

1. if $(\forall j)\, M_1 \models j \Rightarrow M_2 \models j$, then $M_1 \triangleleft M_2$;

2. if $M_1$ and $M_2$ are unordered with respect to the
previous criterion and $(\forall p)\, M_1 \models B[p] \Rightarrow M_2 \models B[p]$,
then $M_1 \triangleleft M_2$;

3. if $M_1$ and $M_2$ are unordered with respect to the
previous criteria and

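$$\bigl((\forall p)\; M_1 \models B[p] \Rightarrow (\exists j)\,\bigl(p \text{ is the consequent of } j$$
$$\wedge\; (\forall q \text{ a monotonic antecedent of } j)\,(M_1 \models B[q] \wedge q \prec_{M_1} p)$$
$$\wedge\; (\forall r \text{ a nonmonotonic antecedent of } j)\; M_1 \not\models B[r]\bigr)\bigr)$$

$$\wedge$$

$$\neg\bigl((\forall p)\; M_2 \models B[p] \Rightarrow (\exists j)\,\bigl(p \text{ is the consequent of } j$$
$$\wedge\; (\forall q \text{ a monotonic antecedent of } j)\,(M_2 \models B[q] \wedge q \prec_{M_2} p)$$
$$\wedge\; (\forall r \text{ a nonmonotonic antecedent of } j)\; M_2 \not\models B[r]\bigr)\bigr)$$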

then $M_1 \triangleleft M_2$;

4. if $M_1$ and $M_2$ are unordered with respect to the
previous criteria and $\prec_{M_1}$ is a graded partial order while
$\prec_{M_2}$ is not, then $M_1 \triangleleft M_2$;

5. if $M_1$ and $M_2$ are unordered with respect to the
previous criteria and $\prec_{M_1}$ is a subrelation of $\prec_{M_2}$, then
$M_1 \triangleleft M_2$;

6. if $M_1$ and $M_2$ are unordered with respect to
the previous criteria and
$(\forall p, w)\,(\pi_{M_1}(p))(w) = 1 \Rightarrow (\pi_{M_2}(p))(w) = 1$ or

$$(\forall j, u, v_1, \ldots, v_m, w_1, \ldots, w_n)\;
(\rho_{M_1}(j))(u, v_1, \ldots, v_m, w_1, \ldots, w_n) = 1
\Rightarrow (\rho_{M_2}(j))(u, v_1, \ldots, v_m, w_1, \ldots, w_n) = 1$$

then $M_1 \triangleleft M_2$.

A TMS model of $S$ is a minimal model in the partial
order $\triangleleft$. Intuitively, a TMS model says that

• as few justifications as possible are satisfied,

• as few elementary positive beliefs as possible are
satisfied,

• as many consequent positive beliefs as possible are
ordered with respect to the monotonic antecedents of
some justification,

• at most one argument justifying any given elementary
belief is offered in an "explanation", $\prec$.

Theorem II.1 A TMS theory, $T$, has a consistent (well-
founded) justification completion, $S$, if and only if it has
a TMS model, $M$, (with graded partial order, $\prec_M$) such
that $(\forall x \in \mathcal{L}_1)\, M \models x \Leftrightarrow x \in S$.

Proof.

$\Rightarrow$: Suppose $T$ and $S$ are as in the statement of the
theorem. By proposition II.3 we may assume without loss
of generality that if we have a graded partial order
for $S$, it is minimal. We construct a TMS model as
follows:

$$(\pi_M(p))(w) = \begin{cases} 1 & \text{if } w = w_p \text{ and } B[p] \in S, \\ 0 & \text{otherwise.} \end{cases}$$

If $j = J[p \mid q_1, \ldots, q_m \mid r_1, \ldots, r_n]$ then

$$(\rho_M(j))(u, v_1, \ldots, v_m, w_1, \ldots, w_n) =
\begin{cases}
1 & \text{if } j \in T \text{ and } u = w_p, \\
  & v_1 = w_{q_1}, \ldots, v_m = w_{q_m}, \\
  & w_1 = w_{r_1}, \ldots, w_n = w_{r_n}, \\
0 & \text{otherwise.}
\end{cases}$$

If there is a graded partial order, $<_S$, then
$w_p \prec_M w_q \Leftrightarrow p <_S q$. The fact that $M$ is a minimal TMS
model follows immediately.

$\Leftarrow$: Let $M$ be a minimal model of $T$ such that
$(\forall x \in \mathcal{L}_1)\, M \models x \Leftrightarrow x \in S$. The consistency of $S$ follows
immediately from the existence of $M$ and the
definition of satisfiability. Any graded partial order $\prec_M$
on $W_M$ induces a similar partial order on the positive
beliefs of $S$. $\Box$


Corollary II.1 If $M_1$ and $M_2$ are distinct TMS models
of a justification completion of a TMS theory, they differ
only in their ordering relations.

Proof. It is immediate from the last theorem that if $S$
is the justification completion in question, then
$(\forall x \in \mathcal{L}_1)\, M_1, M_2 \models x \Leftrightarrow x \in S$. Since $M_1$ and $M_2$ are
minimal, $\pi_{M_1} = \pi_{M_2}$ and $\rho_{M_1} = \rho_{M_2}$. Thus, the two models
can only differ on their partial orders. $\Box$

III.  Assumption-Based  Truth 
Maintenance 

The  TMS’s  of  Doyle  and  Goodwin  are  termed  justification- 
based.  An  alternative  model  of  truth  maintenance  is  the 
assumption-based  approach  introduced  by  de  Kleer.  We 
address  assumption-based  truth  maintenance  by  reduction 
to  an  equivalent  justification-based  model. 

A.  ATMS 

Let $A = \{A_1, \ldots, A_n\}$ be a finite set of assumptions. A
label is any subset of the set of assumptions. The language,
$\mathcal{L}$, will be as before. The language $\mathcal{L}_2$ is $\mathcal{L}_1$ excluding
nonmonotonic justifications. An ATMS theory is a set,

$$\Theta = \{T_X \mid T_X \text{ is a monotonic TMS theory and } A \supseteq Y \supseteq X \Rightarrow T_X \subseteq T_Y\}.$$

The mutual justification completion of $\Theta$ is the set

$$\{S_X \mid S_X \text{ is the justification completion of } T_X \in \Theta\}.$$

$\Theta$ together with the relevant justification closure operators
constitutes a logic of justified belief. Each of the consistent
justification completions, $S_X$, exists since the
corresponding $T_X$ is a monotonic TMS theory. Interpreting a set
of assumptions as an "environment", in de Kleer's sense of
the word, an element of a mutual justification completion
tells us what formulae are "IN" the corresponding
environment. The ATMS model of an ATMS theory then is
merely the set of TMS models:

$$\{M_X \mid M_X \text{ is a TMS model of } T_X \in \Theta\}.$$

Immediate from the various propositions and theorem II.1
we have

Corollary III.1 An ATMS theory has a unique
consistent mutual justification completion. Moreover, the theory
has a unique ATMS model such that
$(\forall x \in \mathcal{L}_2)\, M_X \models x \Leftrightarrow x \in S_X$.

To see how an ATMS theory arises, consider the set
of nodes and justifications from de Kleer [de Kleer, 1986a,
pages 150-151]:

$$\gamma_{x+y=1} : \langle x + y = 1, \{\{A, B\}, \{B, C, D\}\}, \{\ldots\} \rangle,$$
$$\gamma_{x=1} : \langle x = 1, \{\{A, C\}, \{D, E\}\}, \{\ldots\} \rangle,$$
$$\gamma_{x+y=1}, \gamma_{x=1} \Rightarrow \gamma_{y=0}.$$

The underlying logical language, $\mathcal{L}$, seems to be the
language of algebraic equations. The set of assumptions is
$\{A, B, C, D, E\}$. Let $\Theta$ be defined as above with each
contained TMS theory being as small as possible and


$$T_{\{A,B\}} = \{J[x + y = 1]\} = T_{\{B,C,D\}},$$

$$T_{\{A,C\}} = \{J[x = 1]\} = T_{\{D,E\}},$$

=  { J[y  -  0,  x  +  y 

The reader can readily verify that $B[y = 0]$ will be in
the justification completion of every $T_X \in \Theta$ such that
$\{A, B, C, D, E\} \supseteq X$ and either $X \supseteq \{A, B, C\}$,
$X \supseteq \{A, B, D, E\}$, or $X \supseteq \{B, C, D, E\}$, exactly the desired
result.*
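
The verification left to the reader can be sketched in a few lines of Python. The sketch assumes the two labels quoted from de Kleer above and computes the minimal environments in which y = 0 holds by taking pairwise unions and discarding non-minimal sets; the names `minimal`, `combine` and `holds_in` are hypothetical helpers, not part of any ATMS implementation.

```python
from itertools import product

def minimal(environments):
    """Keep only the environments that have no proper subset in the collection."""
    envs = set(environments)
    return {e for e in envs if not any(other < e for other in envs)}

def combine(*labels):
    """Label of a consequent whose antecedents carry the given labels:
    one environment from each antecedent, unioned, then minimized."""
    unions = {frozenset().union(*choice) for choice in product(*labels)}
    return minimal(unions)

def holds_in(label, environment):
    """A node is IN an environment that contains some environment of its label."""
    return any(e <= frozenset(environment) for e in label)

label_sum = {frozenset("AB"), frozenset("BCD")}   # x + y = 1
label_x = {frozenset("AC"), frozenset("DE")}      # x = 1
label_y = combine(label_sum, label_x)             # y = 0
```

The union {A, B, C, D} is dropped because it contains {A, B, C}, leaving the three environments named in the text. As the footnote observes, no nogood processing is modeled here.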

B.  ANRMS 

In [Brown et al., 1987, Gaucas and Brown, 1987] we
introduced a new model of truth maintenance based on
solving equations over Boolean lattices that subsumes both
TMS and ATMS styles of truth maintenance. Indeed,
in the cited references we show how to embed the
Doyle/Goodwin and de Kleer styles of truth maintenance
system in ANRMS. Also, we give an algebraic
characterization of the reduction of an ANRMS to a collection of TMS's
which exactly mirrors the reduction of an ANRMS theory
(below) to a collection of TMS theories. The Assumption-
based Nonmonotonic Reasoning System (ANRMS) has an
associated logic of justified belief analogous to that we have
associated with the ATMS. Let $B$ be a Boolean lattice with
the usual operations of meet, join, and complement
(denoted $\sqcap$, $\sqcup$ and $\bar{\ }$), distinguished constants, top ($\top$) and
bottom ($\bot$), and a partial order ($\sqsubseteq$). An ANRMS theory
is a set,

$$\Xi = \{T_X \mid T_X \text{ is a TMS theory} \wedge \top \sqsupseteq Y \sqsupseteq X \Rightarrow T_X \subseteq T_Y\}.$$

The mutual justification completions and models of $\Xi$ are
defined analogously to those for $\Theta$ above. Since a model
for an ANRMS theory is just the collection of models
associated with a collection of TMS theories, theorem II.1
generalizes directly.

IV.  Conclusions 

We have defined a collection of logical theories and
associated them with various models of truth maintenance.
We have identified the truth maintenance concepts of
premiss, assumption, justification, node, "IN" and "OUT"
with certain syntactic constructs in those logics. We have
characterized the proof theories of those logics in terms of
a justification closure operator and its fixed points. Each


* Notice that we need take no explicit heed of de Kleer's nogood
mechanism as this is entirely an apparatus for avoiding unnecessary
computation. That is, it is deemed unnecessary to compute the
justification closure of a TMS theory, $T_X$, if for some $Y \subseteq X$ the
justification closure of $T_Y$ has both $B[p]$ and $B[\neg p]$ for some $p \in \mathcal{L}$.


instance of a given logic is uniquely identified with a set of
premisses and justifications in a corresponding model of
truth maintenance (and vice versa). We have given a
semantical account of these logics in terms of minimal
models. As it turns out, the labeling process carried out by a
truth maintenance system corresponds to the construction
of a minimal (collection of) model(s) (each) with a graded
partial order should one (they) exist.

ABSTRACT

RUM, the Reasoning with Uncertainty Module described in
the previous paper, has been tested and validated in a
sequence of experiments in both naval and aerial situation
assessment (SA). The purpose of these experiments is to
exercise and evaluate RUM's reasoning capabilities in
correlating sensor reports and tracks, locating and classifying
platforms, and identifying intents and threats. An example of
naval situation assessment is illustrated.

The testbed environment for developing these experiments
has been provided by LOTTA, a symbolic simulator
implemented in Zetalisp Flavors. This simulator maintains time-
varying situations in a multi-player antagonistic game where
players must make decisions in light of uncertain and
incomplete data. RUM has been used to assist one of the LOTTA
players to perform the SA task.


I.  Introduction 

In the previous paper we have described RUM, the
Reasoning with Uncertainty Module whose layered architecture
reflects the typical structure of automated reasoning
techniques [Bonissone 1986,87a].

In  this  paper  we  will  illustrate  the  naval  situation  assessment 
problem  which  was  used  to  validate  RUM.  This  application 
is  based  on  an  architecture  designed  to  simulate  various  mili¬ 
tary  scenarios  involving  Multi-Sensors/Multi-Targets 
(MS/MT)  and  to  perform  situation  assessment  (SA)  related 
tasks.  The  MS/MT  architecture,  illustrated  in  Figure  1,  is 
composed  of  two  major  blocks:  a  reasoning  system  and  a 
simulation  environment. 

The  first  block  of  the  MS/MT  architecture,  the  reasoning 
system,  is  based  on  RUM  and  has  already  been  described  in 
the  preceding  paper.  The  second  block  of  the  MS/MT  archi¬ 
tecture,  the  simulation  environment,  is  described  in  section 

* This is a modified version of the paper Using T-norm Based Uncertainty
Calculi in a Naval Situation Assessment Application that will appear in the
Proceedings of the Third AAAI Workshop on Uncertainty in Artificial
Intelligence, Seattle, Washington, July 1987.

This work was partially supported by the Defense Advanced Research
Projects Agency (DARPA) under USAF/Rome Air Development
Center contract F30602-85-C-0033. Views and conclusions contained
in this paper are those of the authors and should not be interpreted as
representing the official opinion or policy of DARPA or the U.S.
Government.


2. In section 3, we provide some definitions of the tasks
required to perform situation assessment. The last two sections
contain an analysis of the MS/MT experiment and some
preliminary conclusions on this work.

Figure 1. Architecture for Multi-Sensors/Multi-Targets (MS/MT)


2.  An  Object  Based  Simulation  Environment 

The second block of the MS/MT architecture is the
simulation environment. This environment is centered around
LOTTA, an object-oriented symbolic battle management
simulator that maintains time-varying situations in a
multi-player antagonistic game. The development environment
based on LOTTA constitutes a testbed for validating
techniques in reasoning with uncertainty and for performing
information fusion functions [Sweet 1986]. The development
environment is composed of four basic modules: the window
manager, the annotation system, the LOTTA simulator, and
the KEELA interface.

The  window  manager  is  a  map-like  window-oriented  user  in¬ 
terface.  It  controls  the  menu  driven  interaction  of  the  hu¬ 
man  player  with  LOTTA  and  handles  multiple  windows  per 
player. 

The annotation system is an intelligent database for LOTTA.
It  is  composed  of  a  feature  extraction  system  and  a  feature 
watcher.  The  feature  extraction  system  allows  both  simple 
and  complex  time-varying  features  to  be  calculated  and 
stored  (along  with  the  features  or  parameters  that  they 
depend  on  and  the  methods  to  update  them  over  time). 
Every  feature,  internally  or  externally  computed,  has  multi¬ 
ple  views  (numerical  and  graphical  representations  to  allow 
either  people  or  computer  programs  to  use  them  in  decision¬ 
making  or  explanation  tasks).  The  feature  watcher  maintains 
the  dependency  directed  information  that  characterizes  the 
dynamic  support  of  the  features.  The  watcher  will  guide  the 
"lazy"  recomputations  of  those  features  whose  support  has 
changed  since  the  last  computation. 
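
The lazy recomputation guided by the feature watcher can be illustrated with a small Python sketch. It is only an assumption about how such dependency-directed bookkeeping might look; the `Feature` class, its methods, and the range example are hypothetical and are not taken from LOTTA.

```python
import math

class Feature:
    """A time-varying feature whose value is recomputed only on demand."""

    def __init__(self, name, compute=None, depends_on=()):
        self.name = name
        self.compute = compute            # None for externally supplied features
        self.depends_on = list(depends_on)
        self.dependents = []              # features whose support includes this one
        self.stale = True
        self.value = None
        self.recomputations = 0
        for support in self.depends_on:
            support.dependents.append(self)

    def invalidate(self):
        """Mark this feature and everything it supports as out of date."""
        if self.stale:
            return                        # dependents are already stale
        self.stale = True
        for dependent in self.dependents:
            dependent.invalidate()

    def set(self, value):
        """Store an externally computed value; dependents become stale."""
        self.value = value
        self.stale = False
        for dependent in self.dependents:
            dependent.invalidate()

    def get(self):
        """Return the value, recomputing only if the support has changed."""
        if self.stale:
            args = [support.get() for support in self.depends_on]
            self.value = self.compute(*args)
            self.recomputations += 1
            self.stale = False
        return self.value

x = Feature("x")
y = Feature("y")
rng = Feature("range", lambda a, b: math.sqrt(a * a + b * b), [x, y])

x.set(3.0)
y.set(4.0)
first = rng.get()     # computed on demand
again = rng.get()     # served from the stored value
x.set(6.0)
y.set(8.0)            # two support changes, but nothing is recomputed yet
second = rng.get()    # recomputed once, when next requested
```

Two changes to the support cost a single recomputation, because the dependent feature is only marked stale until something asks for it.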


Figure 2 shows a split screen in which two players, Blue and
Orange, using the Window Manager and the Annotation
System can observe the location of their own units, the limits of
the territorial waters (jagged line) and the shipping lane that
is vertically crossing the screen. One of the players (Blue)
can also see the coverage provided by his surface radar. The
coverage is represented by a changing gray level distribution
that represents the probability of detecting any other unit in
that range.

LOTTA is the simulator that executes commands and
maintains internal states. Each simulation unit is a Flavor object
that is a node in a Flavors graph. Message passing is the
uniform communication paradigm for sending commands and
modifying the internal states of the objects. A simulation
cycle corresponds to a real-time variable that is common to all
the players. The simulation cycle is divided into 12 phases:
GAME-SYNCH, SENSOR (initialization, send, receive,
ECM), MOVEMENT, SENSOR (initialization, send, receive,
ECM), COMBAT (CIDS, offensive).
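
One reading of this list that yields exactly twelve phases treats each parenthesized item as a phase of its own. The Python sketch below spells out that reading; the data layout and the `phases` helper are assumptions made for illustration, and the sub-phase names are copied from the text.

```python
# Each entry is (phase group, sub-phases); an empty tuple means the group
# is a single phase. This layout is hypothetical, not LOTTA's own.
CYCLE = [
    ("GAME-SYNCH", ()),
    ("SENSOR", ("initialization", "send", "receive", "ECM")),
    ("MOVEMENT", ()),
    ("SENSOR", ("initialization", "send", "receive", "ECM")),
    ("COMBAT", ("CIDS", "offensive")),
]

def phases(cycle):
    """Flatten the grouped description into an ordered list of phases."""
    ordered = []
    for group, subphases in cycle:
        if subphases:
            ordered.extend(f"{group}/{sub}" for sub in subphases)
        else:
            ordered.append(group)
    return ordered

all_phases = phases(CYCLE)
```

Under this reading the sensors run both before and after movement within one cycle, which is consistent with the later remark that the cruiser's sensors were run twice during one LOTTA cycle.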

KEELA (KEE to LottA interface) links LOTTA with KEE,
the expert system shell that provides the capabilities to write
and execute the rules describing the policies of the player.
A new rule system has been implemented in KEE, to allow
the representation, use, and control of the different
uncertainty calculi. KEELA is based on FLUTE (FLavor to Units
Translation Environment). FLUTE transforms a Flavors
graph, such as the LOTTA objects graph, into a graph of
KEE Units, with their corresponding slots and facets. This
generates a "vocabulary" (names and structures) of the
objects in LOTTA. This transformation enables the
programmer to use the KEE browser to generate a pictorial
representation of the graph, providing an aid for debugging and
documentation. Utility functions for display/explanation link
KEE to LOTTA's Feature and Window Managers.

3.  The  Information  Fusion/Situation  Assessment  Problem 

The Information Fusion (IF)/Situation Assessment (SA) problem
requires a variety of tasks in which uncertainty pervades both
the input data and the knowledge bases. Besides its intrinsic
uncertainty, the information dealt with in each task is usually
also incomplete, time-varying, and, sometimes, erroneous. Thus,
the SA problem represents a strong challenge for most
automated reasoning systems, since it requires an integration of
the uncertainty management with a truth maintenance
system (belief revision system) to maintain the integrity of the
inference base (or of its relevant subset). The SA problem
also requires the reasoning system to detect useless and
contradicting information, rejecting the former and resolving the
latter.

There  is  no  uniformly  agreed  definition  of  what  a  situation 
assessment  problem  entails.  The  following  definitions  have 
been  compiled  and  summarized  from  a  variety  of  sources 
[Levitt  1984],  [Clarkson  1981]  to  succinctly  describe  the  SA 
problem.  Given  a  platform  (aircraft,  ship,  tank)  in  a  poten¬ 
tially  hostile  environment,  the  process  of  performing  Situa¬ 
tion  Assessment  consists  of  the  following  tasks: 

1. Sensor data must be collected from various sources and
described as reports.


Figure 2.

2. Time-stamped sensor reports must be consolidated into
tracks (each track is the trace of an object followed by a
given sensor).

3.  Tracks  associated  to  the  same  object  must  be  fused  into 
a  platform. 

4.  The  detected  platform  must  be  classified  and  identified 
(by  class  and  type). 

5.  Node  organization  (formation  of  the  identified  plat¬ 
forms),  use  of  special  equipment,  and  maneuvering 
must  be  recognized. 

6.  Using  the  knowledge  of  the  opponent’s  doctrines  and 
rules  of  engagement,  the  recognized  formation  and  ob¬ 
served  use  of  special  equipment  must  be  explained  by  a 
probable  intent,  which  is  then  translated  into  a  threat  as¬ 
sessment  (retrospective  SA). 

7. This analysis is then projected into the future to evaluate
plausible plans and to determine likely interesting
developments of the current situation (prospective SA).

The first four tasks constitute what is generally known as
Information Fusion [Dillard 1978; 83] and define the scope of
the first MS/MT experiment.


English Version of Rule-550 (identifying submarines):

Assuming that a sonar was used to generate a sensor report (that
with other reports generated by the same sensor has been attached
to a track associated with a platform), if the detected platform has
a low noise emission, and is located at a depth of at least twenty
meters, then it is extremely likely that it is a submarine.
Otherwise, it may not be a submarine.

RUM's Version of the same rule:

(add-template 'sub.pos.id-sonar-550                ; Name
  'msmt                                            ; KB
  '((is-value? ?report 'noise-emissions 'low)      ; Premise-list
    (u-lessp (get.uncertain.value ?report 'elevation)
             (fuzz -20)))
  '((?report elevation))                           ; List of wffs in premise
  '(?report)                                       ; List of units in premise
  '((is-in-class? (get.value ?report 'track)       ; Context
                  'source '(sonar lotta)))
  '(extremely.likely it.may)                       ; Sufficiency and necessity
  't3                                              ; Aggregation T-norm
  '(submarine report.templates))                   ; Rule class &
                                                   ; instantiation templ.


3.1  Example  of  RUM  rules 

The RUM knowledge base (KB) used in the MS/MT application
is comprised of approximately forty rules, each of which can
be instantiated by new sensor reports, new tracks, or new
platforms. A representative sample of such a KB is provided
by the following two rules.

English Version of Rule-500 (identifying submarines):

Assuming that a radar was used to generate a sensor report (that
with other reports generated by the same sensor has been attached
to a track associated with a platform), if the first time that the
platform was detected (in the track's first report), the platform
was located at a distance of at most twenty miles from our radar
(i.e., it was a close-distance radar pop-up) then it is most likely
that the platform is a submarine. Otherwise, there is a small
chance that it is not a submarine.

RUM's Version of the same rule:

(add-template 'sub.pos.id-close.pop.up-500
  'msmt
  '((u-lessp (get.uncertain.value
               (get.value ?track 'first.report)
               'range)
             (fuzz 20)))
  '(((get.value ?track 'platform)
     class.name submarine s2.rules))
  '((?track first.report))
  '(?track)
  '((is-in-class? (get.value ?report 'track)
                  'source '(radar lotta)))
  '(most.likely small.chance)
  't3
  '(submarine track.templates))


3.2  Notes  on  the  Calculi  Selection  for  Rule  500  and  550 

The T-norm used to detach the conclusion of rule 500 and
550 is T3. This is due to the fact that we want to obtain the
smallest certainty interval associated with the detached
conclusion. The T-conorm used to aggregate the certainties of
the detachments of both rules is S2. This assignment
indicates a lack of correlation among the two rules, which is
substantiated by the fact that independent sources of
information (radar and sonar) are used in the context of the two
rules.
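
For readers without the previous paper at hand, the operators can be sketched in Python on point values. The numbering used here (T1 bounded difference, T2 product, T3 minimum, with S1, S2, S3 as their De Morgan duals) is an assumption based on the usual ordering of this family from smallest to largest T-norm; RUM itself works on certainty intervals, which this sketch does not model.

```python
def t1(a, b):
    """Smallest of the three T-norms (bounded difference)."""
    return max(0.0, a + b - 1.0)

def t2(a, b):
    """Product, the usual choice for uncorrelated arguments."""
    return a * b

def t3(a, b):
    """Minimum, the largest T-norm."""
    return min(a, b)

def s1(a, b):
    """Bounded sum, the largest of the three T-conorms."""
    return min(1.0, a + b)

def s2(a, b):
    """Probabilistic sum, the dual of the product."""
    return a + b - a * b

def s3(a, b):
    """Maximum, the smallest T-conorm."""
    return max(a, b)

def dual(t_norm):
    """De Morgan dual of a T-norm under the negation n(a) = 1 - a."""
    return lambda a, b: 1.0 - t_norm(1.0 - a, 1.0 - b)

a, b = 0.5, 0.25
ordering = [t1(a, b), t2(a, b), t3(a, b), s3(a, b), s2(a, b), s1(a, b)]
```

Because T3 is the largest T-norm, it yields the highest lower bound for a detached conclusion, which is one way to read the remark about obtaining the smallest certainty interval.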

4.  The  Experiment 

In the experiment, a modified version of the naval situation
assessment scenario used by NOSC to test STAMMER and
STAMMER2 [McCall 1979, Bechtel 1979, Ferranti 1981] was
created. In this modified scenario, a missile cruiser of the
type CGN36 operating with a surface radar (SPS-10) and a
passive sensor (GPS-3) faced two platforms (selected from a
set of possible platform classes such as cruisers, destroyers,
frigates, patrol hydrofoils, submarines, merchant ships, and
fishing boats). One of the two platforms was using an active
sensor (navigational radar), while the second platform was
not using any sensor.

The cruiser's task was to track, correlate, and classify each
detected object. Both passive and active sensors on the
cruiser were run twice (during one LOTTA cycle), generating
sensor reports which were translated through the KEELA
interface into observed wffs. The reports were then grouped
into tracks for each sensor. A total of three tracks were
generated: two tracks were produced by the cruiser's active
sensor and one by its passive sensor. Plausible correlations were
made among the tracks to correctly group them into the two
detected platforms. Figure 3 graphically illustrates a portion
of the knowledge base where the report, track, and platform
information is stored. In the same figure it is possible to
observe the rule instantiation (by track) of the two rules (500
and 550) described in section 3.1.

Figure  4  shows  the  sensor  report  information  generated  by 
running  a  sensor  once.  The  parameters  obtained  from  this 
particular  passive  sensor  (GPS-3)  were  the  heading,  position, 
range,  speed,  and  time  at  which  the  platform  was  detected. 
This  information  was  attached  to  a  track  (TRACK-0)  which 
maintained  subsequent  sensor  reports  generated  by  the  same 
sensor  (GPS-3)  and  associated  with  the  same  platform 
(PLATFORM-439).  Figure  5  describes  the  track  informa¬ 
tion. 

Another track (TRACK-3) was generated by using a second
sensor (SPS-10). The information from both tracks was
attached to the same platform. The query posed to RUM was
to deduce the class value of such platform from the track
information. Using the RUM knowledge base and the
backward chaining mode, various attributes of the platform were
inferred or observed. The platform was correctly identified
as a merchant ship. This conclusion was made by a set of
rules based on the fact that the platform was reasonably close
to a shipping lane, it was traveling at a typical merchant's
speed (in the 9-14 miles/hour range), it was not
maneuvering, nor was it trying to dodge the cruiser's surface radar.
Figure 6 shows the attributes of Platform-439, which were
required to determine the platform's class value, as well as
the class value itself.

Figure 7 shows the uncertainty information and
meta-information associated with the value assignments to the
variable [Platform-439 Class-name]. In the slot VALUES, we
can see the platform classes which were considered by the
system and their corresponding certainty bounds¹.


Merchant [.69 1], Submarine [0 .2], Fishing Boat [0 .02]

The best value in terms of its certainty is clearly the one
which identifies Platform-439 as a Merchant. Its certainty's
lower bound indicates a reasonably large amount of positive
(confirming) evidence. Its upper bound indicates the
absence of any negative (refuting) evidence. The class
Submarine obtained no confirming evidence and a large amount of
negative evidence. The refuting evidence was provided by
rule 500, which from the failure to observe a close-distance
radar pop-up determined that there was only a small chance
for the platform to be a submarine. The class Fishing Boat
also had no confirming evidence and an overwhelming
amount of negative evidence. This refuting evidence was
due to the fact that the platform was too far from the fishing
areas, too big, and was using a radar (rules 340, 320, and
330). This information can be obtained from Figure 7, by
observing the logical support for each value, and from Figure 8,
by observing the dominant rules for each value.
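
The reading of the bounds given above can be made concrete with a short Python sketch. It assumes that the lower bound measures confirming evidence and that one minus the upper bound measures refuting evidence, as the paragraph suggests; the helper names are hypothetical, and the Merchant lower bound of 0.69 is an assumed reading of the printed interval.

```python
# Certainty bounds [lower, upper] for [Platform-439 Class-name],
# as quoted from the VALUES slot (0.69 is an assumed reading).
values = {
    "Merchant": (0.69, 1.0),
    "Submarine": (0.0, 0.2),
    "Fishing Boat": (0.0, 0.02),
}

def confirming(bounds):
    """Amount of positive evidence: the lower bound."""
    return bounds[0]

def refuting(bounds):
    """Amount of negative evidence: the distance of the upper bound from 1."""
    return 1.0 - bounds[1]

def best_value(table):
    """Prefer the highest lower bound, using the upper bound to break ties."""
    return max(table, key=lambda name: table[name])

best = best_value(values)
summary = {name: (confirming(b), refuting(b)) for name, b in values.items()}
```

Ranking by lower bound first is only one plausible selection policy; the text does not state how RUM orders competing values.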

The same Figure 7 shows the logical support for each of the
three values considered for the wff [Platform-439 Class-
name]. Each rule instance, fired to infer a value of the wff,
has a cached certainty value (lower and upper bounds) and
an associated validity flag. This was illustrated in Figure 4 of
the previous paper, where each rule instance was
schematically identified as a gate in an acyclic graph. The rule
instances' cached certainty values are illustrated in Figure 8
and show which rules dominated the certainty assignment for
each of the three values.


[Figure 4: KEE display of unit REPORT-440 in knowledge base MSMT
(member of REPORT). Own slots shown: SPEED, ELEVATION, TIME,
HEADING, TRACK, LOTTA.IMAGE, NOISE-EMISSIONS, POSITION, RANGE,
and SIGNAL-STRENGTH. Legible values include TRACK = TRACK-0,
POSITION = (51 95), and RANGE = 23.43075.]

Figure 4. REPORT-440 Attached to TRACK-0


[Figure 5: KEE display of the TRACK-0 unit in the MSMT knowledge base, showing its own slots REPORTS, FIRST.REPORT, LAST.REPORT, LOTTA.TRACK, MAX.HEADING.CHANGE, MAX.SPEED.CHANGE, MIN.SPEED, PLATFORM, and SOURCE.]

Figure 5. TRACK-0 Attached to PLATFORM-439
Figure 6. Platform-439 Unit with Associated Attributes

[Figure 7: KEE display of the PLATFORM-439-CLASS.NAME unit, showing its rule slots (CONSTRAINTS.RULES, DEPENDENT.RULES, DS.RULES, S1.RULES, S1.5.RULES, S3.RULES) and its FLAG, NECESSITY, PLAUSIBILITY, and VALUE slots.]

Figure 7.
[Figure 8: screen display with a source list and a co-norm list, giving for each T-conorm the rule instances that support each platform class (SUBMARINE, MERCHANT, FISHING.BOAT) together with their uncertainty bounds and status (GOOD, IGNORANT, NA).]

Figure 8. Uncertainty Bounds Detached by the Rule Instances in [Platform-439 Class.name] Logical Support


5. Remarks and Conclusions

RUM’s layered architecture properly addresses the requirements
imposed by the SA problem. The MS/MT experiment described in this
paper has been used to illustrate RUM’s capabilities in an IF/SA
application. It is a complete experiment, but certainly not a complex
one. A more strenuous and realistic validation of RUM is in progress:
currently RUM is successfully being used as the reasoning system of
the Situation Assessment module in DARPA's Pilot’s Associate Program
[Sweet 1986]. In this application, the six tasks (described in
section 4) that comprise the retrospective SA problem are addressed
by RUM in scenarios involving up to twenty platforms. This
application is also used to derive some of the real-time requirements
that will represent the focus of future development work in RUM.
Abstract

This report describes an experiment in using an assumption-based and
nonmonotonic reasoning capability in support of strategic analysis as
envisioned by Albert Clarkson. In particular, a new representational
form for his notion of a threat model is developed and used to recode
his North Korean threat scenario. In addition, a methodology for
realizing a selected portion of his functional description of threat
recognition is proposed. This methodology is demonstrated in a
prototype problem solver and a hypothetical threat assessment
involving Clarkson’s encoded scenario. Finally, the roles of causal
and temporal reasoning, uncertainty, and the control of reasoning in
automating threat analysis are discussed.

I.  Clarkson’s  View  of  Strategic 
Analysis 

In his book, Albert Clarkson [Clarkson, 1981] presents a concept for
computer-based strategic analysis, intended to be a start at
addressing some of the problems involved in this activity, e.g.,
human limitations in reasoning about quantities of data. He provides
a specification of strategic analysis in terms of three functional
stages: monitoring, threat recognition, and projection. In this
process model, significant indicators are recognized by the
monitoring stage and associated with models of situations by the
threat recognition stage. The models are then used for predictions
and forecasts by the projection stage. Clarkson’s own concept for
realizing strategic analysis entails the design of various forms or
schemata for both representing knowledge and hypothesizing threat
situations. He claims that they should be flexible in order to
support the analyst’s creative process. Such forms, together with
accompanying analysis routines, are viewed by Clarkson as supporting
a system which

preserves, tailors and allows manipulation of a
large operational situation of prior analysis as a
framework for assigning meaning to new information.

Focusing on the threat recognition stage of analysis, Clarkson's
functional view of this activity requires that the analyst propose
hypothetical situations to be recognized in advance of their full
impact. The aspects of this stage which are relevant to our
representation of threat assessment presented in section III.
include:

• correlating indicators with preestablished threat models

• correlating other previously detected critical events
with preestablished threat models

• identifying key events and activities whose occurrences
have not been detected

• identifying further information needs

• comparing all preestablished threat models with
which the data has positively correlated.

The preestablished threat models attempt to capture “potential
courses of action by various countries, entities and decision
makers”, and act as filters through which the analyst reviews input
data. Clarkson suggests a realization of these models through the
implementation of the following high level representational forms,
some of which are also useful in monitoring. These forms provide a
context for the indicators.

• The PAMNACs form (Projected Alternative Major
National Courses of Action) is used for representing
a country’s definitive national policies and courses of
action with respect to key strategic aspects. It must
be flexible to “accommodate new analytic perspectives
as they arise.”

• The DENs (Decision/Event Networks) are extensions
of the PAMNACs intended to model how a particular
course of action might be implemented, and must be
easily changeable.

• The CEFs (Critical Event Filters) associate with a
node from the PAMNACs or DENs additional activities
which signify that node. These activities can be
thought of as events which can be verified by incoming
data.

• The ANEMs (Anomalous Event Matrices) assist the
analyst in changing his existing models when incoming
data does not support the hypothesized situations
well.

The threat assessment stage then corresponds to these forms as
follows: the indicators are matched against the various models; the
correlations are identified; the models as a set are compared in
terms of their relative levels of activity; novel threat analysis is
conducted.

Figure 1: Example PAMNAC.

Figure 2: Example DEN.


II.  Representing  Threat 
Situations 

A.  A  Threat  Scenario 

Clarkson provides general examples of his representational forms in
terms of a hypothetical national security scenario which addresses
“the problem of warning of hostile activity directed against U.S.
interests and assets by North Korea.” During the monitoring stage,
indications of a turbulent situation developing in the Republic of
Korea (ROK), such as increased student demonstrations, might suggest
the opportunity for hostile activity by North Korea against ROK.
Relevant PAMNACs and DENs (see figures 1, 2, and 3) include nodes
representing North Korean national policies and courses of action,
for example, with respect to the reunification of the Korean
peninsula and with respect to North Korea’s international position.
This scenario can be reformulated in terms of a new approach to
supporting threat assessment, presented in the following sections.
• UNUSUAL MEETINGS OF LEADERS

• DIPLOMATS ORDERED HOME

• GAINING PERMISSION TO VISIT

• REDUCTION IN TRAFFIC ON MAJOR COMMAND CIRCUITS

• PRESENCE OF PROVINCIAL LEADERS IN CAPITAL

• REDUCTION IN PUBLIC APPEARANCES OF PRINCIPAL LEADERS
GOVERNMENT/PARTY

ETC

Figure 3: Example CEF.

This approach does not address the novel threat assessment supported
by the ANEM forms.

B. A Formal Model for Situation Assessment

In this section, a model for situation assessment is described in an
attempt to formalize what the situation assessment problem might be
for the strategic threat analysis domain. This abstraction is
inspired by the diagnosis problem [de Kleer and Williams, 1986]. In
the most extreme case, diagnosis is carried forth in an environment
where the models of the mechanism to be diagnosed are accurate and
complete, and the operational data are reliable and readily
available. In diagnosis, the situation assessment problem is to
enumerate the models of the mechanism compatible with the observed
data. This is done in the context of a structural model (which
remains invariant and is part of the mechanism’s specification) and a
functional model (which may include local deviations from the
device’s specified, functional model).

The structural/functional model can be adopted as a general paradigm
for situation assessment, with the view that “theory formation” is
not a situation assessment problem. The structural model defines the
essential causality of the artifact to be analyzed, and observables
are interpreted within the confines of that causal model. Some
considerations that distinguish one situation assessment problem from
another are

1. the extent to which time has a role in the interpretation
of structure and function,

2.  the  accuracy  of  observations, 

3.  the  cost  of  observations, 

4.  the  availability  of  observations, 

5.  the  extent  to  which  observations  are  direct  evidence 
of  function. 
In this light, a structural and functional description for one view
of strategic situation assessment can be formulated. A structural
description corresponds to PAMNACs and DENs and can be viewed as a
directed acyclic graph each of whose nodes corresponds to assuming a
causal linkage among particular events. Functional descriptions, on
the other hand, correspond to the actual logical (entailment)
relations among events. These events include strategic decisions such
as decide whether or not to ask the Soviet Union for military
assistance. (Though most of these decisions are under the control of
North Korea, some, such as the USSR’s deciding whether or not to
respond favorably to a request for assistance, are not.)

We implicitly identify each of the assumption nodes with the event
that is the effect of the corresponding causal linkage. Nodes
occurring at the heads of the connecting arcs in the graph are
interpreted as occurring after those at the tails in the sense that
the corresponding effects of the former precede the corresponding
effects of the latter. The situation assessment problem (informally)
is to determine which of the causal linkages can be presumed to be in
effect. These causal linkages represent the decision making
structures that are thought to prevail in North Korea. In support of
the reasoning mechanisms discussed in section III., the structural
model can be represented by a distributive lattice of entities called
“situations.” The situations are the ‘nodes’ mentioned above, with
the arcs of the graph corresponding to the ordering of the lattice,
but with the opposite sense. Hence the sense of the order in the
lattice is the opposite of that usually associated with time. A
situation is said to sanction a causal relationship, in that a
reasoning agent takes the corresponding nonmonotonic logical
relationship as axiomatic only when the sanctioning situation is part
of its current assessment. A situation assessment will then be one of
a distinguished collection of the possible lattice expressions.

The axioms sanctioned by situations are called causal rules, and are
one aspect of the functional description. The second aspect of the
functional description corresponds to CEFs. These are represented as
logical entailments called indicative rules. Indicative rules provide
evidence that certain events have occurred that cannot be directly
detected. Indicative rules have monotonic antecedents (indicators)
and nonmonotonic antecedents (contraindicators). Rules of a special
kind, called confirmation rules, have only monotonic antecedents. The
set of monotonic antecedents of an indicative rule is called a
critical set. Notice that the same formula may appear as a monotonic
antecedent of more than one indicative rule.
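The distinction between indicators, contraindicators, and confirmation rules can be made concrete with a small data structure. The following is only a minimal sketch in Python, not the representation used by the prototype: the class name, the method names, and the two example rules (including their antecedent sets) are assumptions introduced for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IndicativeRule:
    """Hypothetical encoding of an indicative rule."""
    event: str                                  # the indicated (consequent) event
    indicators: frozenset                       # monotonic antecedents: the critical set
    contraindicators: frozenset = frozenset()   # nonmonotonic antecedents

    @property
    def is_confirmation(self) -> bool:
        # A confirmation rule has only monotonic antecedents.
        return not self.contraindicators

    def fires(self, observed: set) -> bool:
        # Every indicator is observed and no contraindicator is observed.
        return self.indicators <= observed and not (self.contraindicators & observed)


# Illustrative rules only; these antecedent sets are not taken from the paper.
r1 = IndicativeRule("s1", frozenset({"t1", "t2"}))
r2 = IndicativeRule("q5", frozenset({"t18"}), frozenset({"t9"}))


def events_indicated_by(formula, rules):
    """Events whose critical set contains the given formula."""
    return sorted(r.event for r in rules if formula in r.indicators)
```

Under these assumptions, `r1` is a confirmation rule, while `r2` fires on `{"t18"}` but not on `{"t18", "t9"}`; `events_indicated_by` illustrates that one formula may belong to several critical sets.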

A situation assessment begins by assuming the situation which
corresponds to the belief that none of the causal rules is currently
sanctioned. That is, NK’s decision making processes are inactive. As
data comes in, rules are triggered that cause the current assessment
to become a (non-trivial) conjunction of situations (corresponding to
causal rules that are consistent with and supportive of the current
observations). Because of the contraindicators, it is possible for
the belief status of indicated events to oscillate, that is, vary
between believed and not believed (“IN” and “OUT” [Doyle, 1979]). In
the special case that belief is supported by a confirmation rule, the
indicated event will be permanently adopted. To some extent the
reasoning agent’s problem is to choose between the definitive
evidence given by confirmation rules and the suggestive evidence
given by indicative rules having nonmonotonic antecedents.
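The change of belief status as data arrives can be illustrated by replaying a stream of observations against a few rules. The sketch below is an assumption-laden illustration in Python, not the prototype's reason maintenance: the function name `replay`, the triple encoding of rules, and the two example rules are hypothetical.

```python
def replay(rules, observations):
    """Replay observations one at a time and record each event's status.

    rules: list of (event, indicators, contraindicators) triples of sets.
    Returns one {event: "IN" or "OUT"} snapshot per observation.
    """
    observed, confirmed, history = set(), set(), []
    for obs in observations:
        observed.add(obs)
        status = {}
        for event, indicators, contra in rules:
            fires = indicators <= observed and not (contra & observed)
            if fires and not contra:
                confirmed.add(event)        # confirmation rule: adopted permanently
            if event in confirmed or fires:
                status[event] = "IN"
            else:
                status.setdefault(event, "OUT")
        history.append(status)
    return history


# Illustrative rules only (hypothetical antecedents).
rules = [
    ("s1", {"t1", "t2"}, set()),     # confirmation rule
    ("q21", {"t18"}, {"t9"}),        # defeasible: withdrawn once t9 is observed
]
history = replay(rules, ["t1", "t2", "t18", "t9"])
```

With these assumed rules, `s1` becomes IN after the second observation and stays IN, whereas `q21` goes IN when `t18` arrives and returns to OUT when the contraindicator `t9` is observed.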

A crucial element of the situation assessment process is the control
of the inferences and observation requests made in order to arrive at
the current assessment. The concept for a situation assessment
problem solver outlined in section III. supports such control at
various levels. Several observations can be made about this approach.
For simplification, situation assessments are defined with respect to
a fixed theory. Causal and indicative rules are propositional in
nature and do not have explicit temporal or certainty aspects. Such
oversimplifications result in inadequate expression of temporal
relations and general logical relations.

C. A Formal Language for Situation Assessment Models

The situations and causal and indicative rules of situation
assessment models can be represented in a formal system of logic
called Epilog. Epilog is an epistemic, modal, nonmonotonic extension
of propositional logic equipped with a minimal model semantics
[Brown, 1987]. Epilog’s proof theory is based on ordinary
propositional deduction together with lattice theory-based reason
maintenance [Benanav et al., 1986, Brown et al., 1987, Brown, 1987]
and the rule of necessitation. Informally, the Epilog syntax contains
the following entities:

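• Propositions

$p,\ q,\ s,\ t$

• Clauses

$p,\quad q \leftarrow p,\quad q \leftarrow q_1 \wedge \cdots \wedge q_n$

• Beliefs

$\Box p,\quad \Box\neg q,\quad \neg\Box s$

• Justifications

$\Box p \leftarrow [\neg]\Box q_1 \wedge \cdots \wedge [\neg]\Box q_n$

• Situations

$\top,\ \bot,\ M,\ P,\ Q,\ S,\ \overline{P},\ P \cup Q,\ P \cap Q$

• Axioms

situation $\rightarrow$ clause, situation $\rightarrow$ justification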

Theories are sets of axioms whose situations when conjoined are not
algebraically equal to $\bot$. The prototype problem solver whose
operation is described in section III.B. currently uses the forward
and backward chaining machinery of Intellicorp’s KEE™ 3.0 rule
system, a propositional prover, and reason maintenance to produce the
inferences that would be produced by a ‘native’ implementation of
Epilog.
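The condition that the conjoined situations of a theory must differ from the bottom element can be checked mechanically once a concrete lattice is fixed. The sketch below assumes, purely for illustration, that situations are subsets of a small finite set of atoms, so that meet, join, and complement are set intersection, union, and difference; the atoms `w1` to `w4`, the helper names, and the example axioms are hypothetical and are not part of Epilog.

```python
from functools import reduce

# Assumed representation: a situation is a subset of these hypothetical atoms.
ATOMS = frozenset({"w1", "w2", "w3", "w4"})
TOP, BOTTOM = ATOMS, frozenset()


def meet(p, q):
    return p & q


def join(p, q):
    return p | q


def complement(p):
    return TOP - p


def is_theory(axioms):
    """axioms: iterable of (situation, formula) pairs.

    The set qualifies as a theory when the meet of its situations is not BOTTOM.
    """
    conjoined = reduce(meet, (situation for situation, _ in axioms), TOP)
    return conjoined != BOTTOM


P = frozenset({"w1", "w2"})
Q = frozenset({"w2", "w3"})
consistent = is_theory([(P, "q1 <- s1"), (Q, "q5 <- q1")])
inconsistent = is_theory([(P, "q1 <- s1"), (complement(P), "p1 <- s1")])
```

Here `P` and `Q` share the atom `w2`, so the first set of axioms is a theory, while a situation conjoined with its own complement yields the bottom element and the second set is rejected.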


D.  An  Encoding  of  a  Threat  Model 

In this section we present a portion of the structural and functional
description of the North Korea scenario (with extensions). We base
this on the model described in section II.B. There is for each
situation, $A_i$, sanctioning a causal linkage, an event, $a_i$, that
is the effect of that linkage. The linkage itself will be represented
as a logical implication. Thus when the situation assessment
reasoning agent has $A_i$ as a conjunctive component of assessment,
the agent sanctions the notion that (dis-)belief in the
(non)monotonic antecedents entails belief in the consequent or effect
of the causal linkage. Using the abbreviations ‘NK’, ‘ROK’, ‘USSR’,
and ‘PRC’ for North Korea, the Republic of Korea, the Soviet Union,
and the Peoples’ Republic of China respectively, the following
propositions correspond to decision events:

• s1 - North Korea makes a policy change

• q1 - NK seeks unification of the peninsula by military
means after U.S. military withdrawal via protracted,
all out war

• q4 - NK seeks PRC military aid

• q5 - NK seeks USSR military aid

• q6 - NK invests in increase in own military output

• q7 - NK initiates campaign to keep Japan neutral

• q8 - NK develops cover and deception

• q9 - NK builds up agents in ROK

• q18 - PRC decision to help NK negative

• q19 - PRC decision to help NK positive

• q20 - USSR decision to help NK negative

• q21 - USSR decision to help NK positive

• q22 - higher level NK effort

• q23 - lower level NK effort

• q12 - NK initiates build-up

• q27 - NK initiates high increase in one or more of
aircraft, ground forces equipment, naval vessels

• q28 - NK initiates low increase in one or more of
aircraft, ground forces equipment, naval vessels

• q13 - NK initiates one or more of economic programs
or political programs to keep Japan neutral

• q15 - NK develops cover and deception through one
or more of sigint, forces/equipment, false actions

• p1 - NK seeks unification of the peninsula by military
means before U.S. military withdrawal via quick strike

The following additional propositions (indicators) correspond to
events and are needed for indicative rules:

• t1 - unusual meetings of NK leaders

• t2 - diplomats ordered home from NK capital

• t3 - difficulty in foreign visitors gaining permission to
visit NK capital

• t4 - reduction in traffic on major command circuits in
NK capital

• t5 - presence of provincial leaders in NK capital

• t6 - reduction in public appearances of principal NK
leaders government/party

• t7 - NK diplomatic meetings with PRC

• t8 - NK diplomatic meetings with USSR

• t9 - intelligence reports USSR decision to support NK
negative

• t10 - death of important NK political figure

• t11 - NK economic crisis

• t12 - NK internal political conflict

• t13 - NK military leaders present in PRC

• t14 - NK grants PRC economic favors

• t15 - pro PRC propaganda in NK government media

• t16 - PRC meetings with ROK

• t17 - increased NK diplomatic communication with
PRC

• t18 - increased NK diplomatic communication with
USSR

• t19 - intelligence reports high increase in aircraft in
NK

• t20 - convergence of NK ground units in areas outside
their garrisons

• t21 - intelligence reports low increase in aircraft in NK

Some  of  the  partial  order  relations  in  the  lattice  of 
situations  are  indicated  below. 
Some of the causal rules of the structural model follow.

Some relevant indicative rules include the following.


III. Reasoning About Threat
Situations 

A.  A  Situation  Assessment 

The automation of threat assessment alluded to in section II.B. can
be demonstrated through a brief reasoning sequence based on
Clarkson’s North Korean threat scenario. A high level overview of the
reasoning sequence follows. Initially, with no information to
indicate otherwise, there are no imminent threats from NK.
Eventually, the observation of some events, including an unusual
meeting of leaders in the NK capital and diplomats being ordered
home, indicates that NK is making a policy change. Such a policy
change may lead to several courses of action on the part of NK. The
one preferentially assumed (based on a plausibility ordering of
situations) to be in progress is an attempt at unification by
military means after US withdrawal, via a protracted, all out war. It
is reasonable to believe that NK is seeking USSR military aid if it
can be determined that NK is having increased diplomatic
communication with USSR (which is indeed observed), and there is no
evidence to indicate that the aid request is not being made. It is
also reasonable to believe that the USSR will be supportive unless
there is evidence to indicate otherwise. From assuming such
assistance, it can then be assumed that NK will only need to initiate
a low increase in military equipment build-up. The receipt of an
intelligence report that the USSR decision was not in support of NK
contradicts the belief that they would be supportive. This leads to
the revised assumptions that after requesting USSR aid and being
turned down, NK will have a higher level of effort and will greatly
increase its military forces.
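The revision described in this reasoning sequence can be traced with a small defeasible forward chainer. The sketch below is an illustration in Python under stated assumptions: the rule set is hypothetical (it is a guess at rules that would reproduce the narrative, written with the proposition names of section II.D.), "unless" conditions are checked against observations only, and the assessment is recomputed from scratch for each set of observations rather than maintained incrementally by a reason maintenance system.

```python
# Hypothetical rules: (consequent, required antecedents, "unless" observations).
RULES = [
    ("s1",  {"t1", "t2"}, set()),     # unusual meetings, diplomats ordered home
    ("q1",  {"s1"},       set()),     # preferred course of action (assumed)
    ("q5",  {"q1", "t18"}, set()),    # NK seeks USSR military aid
    ("q21", {"q5"},       {"t9"}),    # USSR supportive unless reported otherwise
    ("q20", {"q5", "t9"}, set()),     # USSR decision negative
    ("q23", {"q21"},      set()),     # lower level NK effort
    ("q28", {"q23"},      set()),     # low increase in equipment
    ("q22", {"q20"},      set()),     # higher level NK effort
    ("q27", {"q22"},      set()),     # high increase in equipment
]


def assess(observed):
    """Return the events believed given the observations (fixpoint of RULES)."""
    observed = set(observed)
    believed = set(observed)
    changed = True
    while changed:
        changed = False
        for head, required, unless in RULES:
            if head not in believed and required <= believed and not (unless & observed):
                believed.add(head)
                changed = True
    return believed - observed


before = assess({"t1", "t2", "t18"})
after = assess({"t1", "t2", "t18", "t9"})
```

Under these assumed rules, `before` contains `q21`, `q23`, and `q28` (USSR support and a low build-up), while `after` replaces them with `q20`, `q22`, and `q27` once the negative intelligence report `t9` is received.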

In this example, the information about diplomatic communication with
USSR suggests a proper subset of the possible situations presented in
the structural model. Deducing that this information discriminates
among several candidate situations involves reasoning control, i.e.,
utilizing selected rules sanctioned by selected situations. For
example, from the indication of a policy change, it can be deduced
that there are several situations sanctioning causal entailments
whose antecedents include a policy change. From any particular one,
it may be possible to deduce that certain decision events have
already been initiated. From such predictions it may also be possible
to deduce observations to be made that can either substantiate the
initiation or eliminate it. In the specific example above, assuming
that NK is seeking unification by military means, after US
withdrawal, via all out war, with the assistance from the USSR but
not the PRC, suggests diplomatic interaction between NK and USSR.
Subsequent reasoning leads to the belief in low build-up of NK forces
which, upon receipt of new information, is revised to belief in high
build-up of NK forces. If it was eventually observed that a low
build-up of forces was in progress, the situation assessment
sanctions the belief that the PRC was providing assistance. The
deductions referred to above are performed by forward and backward
inference using the rules of the structural/functional model, the
partial ordering on the situations, and current control strategy.
Such a reasoning capability is supported by the architecture
suggested in section III.B.

B.  A  Problem  Solving  Architecture  for 
Situation  Assessment 

The  situation  assessment  paradigm  envisioned  is  supported 
by  a  problem  solving  system  with  the  following  compo¬ 
nents:  a  reason  maintenance  system  (ANRMS),  a  deduc¬ 
tive  system  (DS)  consisting  of  a  propositional  theorem 
prover  and  a  defeasible  inference  mechanism  directly  sup¬ 
ported  by  the  ANRMS,  and  a  control  system  (CS).  The 
interfacing  among  these  subsystems  is  more  complicated 
than  simple  layering,  but  to  a  first  approximation  one  can 
imagine  them  to  be  layered  in  an  ascending  hierarchy  in 
the  order  cited. 

As discussed in section II.B., the structural/functional
model  is  encoded  as  a  collection  of  causal  rules  captur¬ 
ing  the  causality  between  various  decision  making  events, 
and  a  collection  of  indicative  rules  modelling  possible  an¬ 
tecedents  and  logical  consequences  of  such  decisions.  The 
situations  are  represented  as  ANRMS  Boolean  lattice  ele¬ 
ments,  and  the  justifications  and  beliefs  are  represented  as 
ANRMS  constraints  and  nodes  respectively.  The  justifica¬ 
tions  have  the  property  that  establishing  their  monotonic 
antecedents,  while  having  no  evidence  for  their  nonmono¬ 
tonic  antecedents  (defeasibly)  assures  their  consequents. 
The clauses and propositions are represented by logical
formulae.  Establishing  the  antecedents  of  a  material  impli¬ 
cation  can  be  used  to  confirm  the  consequent.  A  situation 
assessment  is  an  algebraic  expression  from  the  Boolean  lat¬ 
tice  of  situations. 

An  operational  situation  assessment  application 
(SAA)  is  a  collection  of  reasoning  agents  (or  processes), 
each  of  which  has  under  its  control  a  DS  with  a  corre¬ 
sponding  assessment  (or  state)  characterized  by  a  lattice 
expression.  A  reasoning  agent’s  DS  reasons  from  certain 
pieces  of  knowledge  represented  by  nodes  in  the  ANRMS. 
In  particular,  it  reasons  from  those  nodes  whose  reason 
maintenance  labels  have  disjuncts  that  are  consistent  with 
the reasoning agent’s characterizing lattice expression L,
i.e., disjuncts which when conjoined with L do not sanction
logical falsehood (contradiction). A given reasoning
agent continues to reason until a contradiction is produced,
that is, when the distinguished reason maintenance node,
FALSE, is entailed by that agent’s situation assessment. A
reasoning agent is termed active if its situation assessment
does not entail the FALSE node. When an agent produces a
contradiction, a new reasoning agent is generated, whose
situation assessment does not (currently) entail an
inconsistency. At any given time the situation assessment is
the (lattice-theoretic) meet of the situation assessment
expressions of each of the reasoning agents whose assessments
do not currently entail a contradiction. In general it will be
the aspiration of a situation assessment application to have
exactly one active reasoning agent whose situation assessment
will have exactly one disjunct. Arranging for this to be the
case is the central (and essential) control issue of any
situation assessment application.
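
The node-selection rule described above can be illustrated with a small Python sketch. It is only a sketch under stated assumptions: a lattice expression is simplified to a single conjunction of assumption names, a node label is a list of such conjunctions (its disjuncts), and contradictions are recorded as "nogood" sets. The names `Env`, `usable_disjuncts`, and `is_active` are hypothetical and are not taken from the ANRMS.

```python
from typing import FrozenSet, Iterable, List

Env = FrozenSet[str]  # hypothetical encoding: a conjunction of assumption names


def consistent(env: Env, nogoods: Iterable[Env]) -> bool:
    """True if env contains no recorded contradiction (nogood)."""
    return not any(nogood <= env for nogood in nogoods)


def usable_disjuncts(label: List[Env], agent_expr: Env,
                     nogoods: List[Env]) -> List[Env]:
    """Label disjuncts that, conjoined with the agent's expression L,
    do not entail FALSE."""
    return [d for d in label if consistent(d | agent_expr, nogoods)]


def is_active(agent_expr: Env, nogoods: List[Env]) -> bool:
    """An agent is active while its own assessment does not entail FALSE."""
    return consistent(agent_expr, nogoods)


# Toy data (illustrative only): assuming both A and not_A is contradictory.
nogoods = [frozenset({"A", "not_A"})]
agent_L = frozenset({"M", "A"})
node_label = [frozenset({"not_A"}), frozenset({"M"})]
usable = usable_disjuncts(node_label, agent_L, nogoods)
```

In this toy case the agent may reason from the node only through the disjunct `{"M"}`, because the disjunct `{"not_A"}` conflicts with the agent's assumption `A`.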

Such  a  problem  solving  paradigm  differs  from  de 
Kleer’s  General  Diagnostic  Engine  in  the  following  ways: 

1.  there  are  two  levels  of  inference:  inferences  about  the 
problem  solving  domain  (domain  deductions)  and  in¬ 
ferences  about  the  control  of  the  former  (control  de¬ 
ductions);  since  both  inference  structures  are  logic 
based,  one  is  said  to  have  deductive  control  of  infer¬ 
ence 

2.  the  deductive  system  for  the  problem  domain  has  vari¬ 
ous  “points  of  intercession”  at  which  deductive  control 
can  be  exercised 

3.  the  inferential  stance  of  both  layers  is  firmly  rooted  in 
logic 

4.  there  is  an  explicit  abstraction  of  the  notions  of  a  rea¬ 
soning  agent  and  a  reasoning  agent’s  “mental  state”; 
this  abstraction  is  presented  at  the  interface  between 
the  theorem  prover  responsible  for  domain  deductions 
and  the  theorem  prover  responsible  for  control  deduc¬ 
tions. 

C.  An  Instance  of  a  Problem  Solver 

A  realization  of  the  situation  assessment  problem  solving 
paradigm  discussed  in  III.B.  devolves  to  addressing  a  col¬ 
lection  of  control  issues.  These  issues  appear  to  fall  into  a 
natural  hierarchy  and  as  such  impose  a  natural  hierarchi¬ 
cal  structure  on  a  situation  assessment  problem  solver.  To 
each  control  issue  corresponds  a  control  policy  for  guid¬ 
ing  control,  a  policy  language  for  describing  the  policy, 
and  a  control  methodology  for  realizing  the  policy.  An  ex¬ 
ample  of  a  policy  might  be  the  prescript  to  assume  that 
either  part  A  is  working  or  that  it  is  not  working.  A  policy 
language  might  be  the  collection  of  conjunctive  lattice  ex¬ 
pressions  containing  the  assumption  A  or  its  complement. 
An  example  of  a  control  methodology  might  be  a  proce¬ 
dure  that  chose  only  deductions  entailed  by  the  previously 
mentioned  control  policy.  There  is  a  list  of  roughly  a  half 
dozen  control  issues,  arranged  in  a  two-tiered  hierarchy. 
The  two  tiers  correspond  roughly  to  deciding  when  to  cre¬ 
ate  a  reasoning  agent  and  when  to  schedule  it  to  run  vs. 
deciding what deductive goal a reasoning agent should
attempt to achieve and how to achieve it:

1. reason only from nodes sanctioned by the reasoning
agent’s current assessment;

2.  use  only  axioms  sanctioned  by  the  reasoning  agent’s 
current  assessment; 

3.  in  general  causal  rules  are  forward  chained; 

4.  the  disambiguation  of  situation  assessments  is  to  take 
place  in  a  depth-first  fashion; 

5. in general, indicative rules are backward chained for
the purpose of establishing the most likely discriminating
measurement to disambiguate a situation assessment.

D.  A  Problem  Solver’s  Reasoning  Se¬ 
quence 

This  section  presents  an  example  reasoning  sequence  in  de¬ 
tail  in  terms  of  the  outlined  problem  solver  of  section  III.C. 
The reasoning sequence of section III.A. is described as a
sequence  of  deductive  states,  some  of  which  summarize 
several  deductive  steps.  At  each  state,  the  correspond¬ 
ing  situation  assessment  (which  necessarily  does  not  entail 
an  inconsistency)  is  based  on  the  current  set  of  beliefs. 
The  distinguished  assumption  M  represents  the  assump¬ 
tion  that  all  measurements  are  correct,  and  is  required  to 
be  entailed  by  all  situation  assessments. 

The  assessment  process  is  initiated  by  the  following 
sequence. 

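• The reasoning agent’s initial assessment is $M \sqcap S_1$

• Unsolicited observations lead to
$M \to \Box t_1$
$M \to \Box t_2$
$M \to \Box t_3$
$M \to \Box t_4$
$M \to \Box t_5$
$M \to \Box t_6$

• Forward chaining on

$\Box s_1 \leftarrow \Box t_1 \land \Box t_2 \land \Box t_3 \land \Box t_4 \land \Box t_5 \land \Box t_6 \land \neg\Box t_{10} \land \neg\Box\ldots \land \neg\Box\ldots$

leads to $M \to \Box s_1$

• The reasoning agent’s assessment supports the
contradiction $M \sqcap S_1 \to \Box s_1 \land \Box\neg s_1$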

Initial belief revision and refinement follows.

• The most preferable initial revision is $M \sqcap S_1$

• The subsequent refinements are $M \sqcap Q_1$, $M \sqcap Q_5$,
$M \sqcap Q_{21}$, $M \sqcap Q_{23}$, and $M \sqcap Q_{28}$

• The refinement $M \sqcap Q_5$ is supported by backchaining
from $\Box q_5 \leftarrow \Box t_{18} \land \neg\Box\neg q_5$ which solicits the
observation of $t_{18}$ yielding $M \to \Box t_{18}$ and
$M \sqcap Q_5 \to \Box q_5$

• The refinement $M \sqcap Q_{21}$ is supported by backchaining
from $\Box q_{21} \leftarrow \Box q_5 \land \neg\Box\neg q_{21}$ which yields
$M \sqcap Q_{21} \to \Box q_{21}$

• The reasoning agent’s new best assessment is $M \sqcap Q_{28}$

New evidence leads to further reasoning.

• Unsolicited observation leads to $M \to \Box t_9$

• Forward chaining from $\Box\neg q_{21} \leftarrow \Box t_9$ leads to
$M \to \Box\neg q_{21}$

• Forward chaining on the causal rules leads to
$M \sqcap Q_{28} \to \Box q_{21}$

• But this means that the reasoning agent’s current
assessment supports the contradiction
$M \sqcap Q_{28} \to \Box q_{21} \land \Box\neg q_{21}$

• Note also that $M \to \Box\neg q_{21}$ negates the earlier
nonmonotonic support of $\Box q_{21}$

Subsequent  belief  revision  and  refinement  updates  the 
current  assessment. 

• $Q_5$ being the least (in the lattice) situation above $Q_{28}$
in which the contradiction goes away, the reasoning
agent’s initial reassessment is $M \sqcap Q_5$

• Since the only choices for refinement are $M \sqcap Q_{20}$,
$M \sqcap Q_{22}$, and $M \sqcap Q_{27}$, the reasoning agent’s final
assessment is $M \sqcap Q_{27}$

IV.  Conclusions 

We  have  presented  a  formal  model  of  situation  assessment 
and  a  logical  language  and  interpreter  for  encoding  threat 
situations.  We  have  also  proposed  a  problem  solving  ar¬ 
chitecture  comprised  of  both  the  interpreter  and  a  system 
for  controlling  its  application. 

A  lattice  theoretic  reason  maintenance  system  serves 
as  the  focus  of  control  for  the  problem  solver  in  forward 
and  backward  chaining.  Lattice  situations  encode  both 
assessments  and  a  weak  (but  sufficient  for  the  present  pur¬ 
pose)  model  of  time.  Nonmonotonic  justification  allows 
the  weak  support  of  certain  conclusions  which,  with  new 
evidence,  may  be  withdrawn. 

The  structural/functional  model  is  encoded  in  the  log¬ 
ical  language.  Rules  which  are  sanctioned  by  particular 
situations  encode  the  causal  content  of  the  North  Korea 
strategic  decision  model  while  rules  sanctioned  by  all  sit¬ 
uations  encode  the  functional  content.  Candidate  situa¬ 
tion  generation  is  done  in  a  way  similar  to  the  practice  in 
diagnosis.  In  this  view  of  situation  assessment  the  com¬ 
putational  goal  is  not  merely  to  minimize  the  number  of 
observations  needed  to  formulate  an  assessment,  but  also 
to  make  the  most  reliable  observations.  Kautz  and  Allen 
[1986] present a promising theoretical framework for carrying
out the recognition task for situation assessment problems.
There are interesting parallels and contrasts between
their work and our own efforts.

Although  it  appears  that  Clarkson’s  model  of  situa¬ 
tion  assessment  (as  published)  is  adequately  representable 
in  the  paradigm  presented  here,  problems  would  arise 
should  strategic  situation  assessment  require  a  richer  rep¬ 
resentation  of  temporal  phenomena.  Indeed,  this  work 
demonstrates  essential  roles  for  the  representation  of 
causality,  time  and  uncertainty,  and  for  the  general  issue 
of  the  control  of  reasoning. 

In  our  model  time/causality  is  in  effect  represented 
by  material  implications  having  nonmonotonic  antecedents 
and  being  sanctioned  by  situations  that  serve  in  effect  as 
temporal  indices.  In  the  indications  and  warning  system 
of Douglas Lenat et al. [Lenat et al., 1983] time is
represented explicitly through the use of a blackboard and an
interval representation. The lessons provided by Lenat’s
work along with a theory of time and causality related
to the one developed by Shoham [1986] offer a promising
framework for an improved representation.

There  is  an  important  role  for  uncertainty  as  well. 
For  example,  the  number  of  verifiable  entries  in  the  CEF 
associated with an event in the PAMNACS affects the degree
of belief in that event. In addition, incoming data has a
particular  degree  of  meaning  with  respect  to  a  particular 
assessment.  Our  model  indirectly  addresses  uncertainty 
through  the  notion  of  a  critical  set,  and  assumes  a  fixed 
partial  order  for  determining  the  “best  discriminating  ob¬ 
servation.”  Lenat’s  system  provides  a  way  for  determining 
those  facts  whose  certainty  is  crucial  to  important  con¬ 
clusions  through  the  use  of  rules  which  have  ‘strong’  and 
‘weak’  conditions.  This  is  accomplished  by  running  the 
rules  in  two  separate  worlds,  one  in  which  only  strong  con¬ 
ditions  are  considered  and  one  in  which  only  weak  con¬ 
ditions  are  considered.  Major  discrepancies  in  important 
predictions  can  be  traced  to  facts  whose  validation  must 
be  given  further  attention.  It  appears  that  the  architec¬ 
ture  presented  here  could  provide  the  basis  for  supporting 
an  arbitrary  number  of  such  worlds. 

The use of assumption-based and nonmonotonic reasoning
offers a point of departure for a situation assessment
system  to  address  various  issues  in  the  control  of  inference, 
both  at  the  domain  dependent  (strategic)  and  domain  in¬ 
dependent  (tactical)  levels. 

Finally,  Clarkson  points  out  that  strategic  analysis  is 
not  really  a  linear  process  in  that  the  assumptions  made  in 
models  could  be  affected  by  the  analyses  of  the  projection 
stage.  He  also  distinguishes  between  changes  made  to  the 
models  and  changes  made  in  the  models,  the  former  being 
the  creation  of  a  new  context.  As  mentioned  earlier,  this 
work  does  not  address  the  problem  of  theory  formation, 
and  does  not  have  the  equivalent  of  ANEMs  in  its  view  of 
a  structural  model.  It  appears,  however,  that  assumption 
based  reasoning  could  play  an  important  role  in  projection 
analyses. For example, knowing what events are shared
by several possible projections would enable measures to
be taken which prepare against more than one possible
outcome.
Abstract


Recent  work  in  automatic  diagnosis  has  employed  formal  logic  as  a  representation  and 
the  notion  of  consistency  as  the  test  for  causal  explanation.  This  paper  uses  the  notion 
of  variational  analysis  as  a  basis  for  a  formal  approach  to  diagnosis  which  develops  a 
set  of  differential  equations  to  represent  the  behavior  of  the  system  under  test.  Several 
examples  in  digital  logic  are  shown  in  which  specific  failure  modes  are  substituted  into  these 
equations  to  give  an  operational  representation  of  the  system.  The  resulting  equations  are 
evaluated  for  specific  measurements  to  give  a  set  of  constraint  equations  which  are  solved 
to determine a diagnosis (which may be multiple fault). Aside from being an approach


1  Introduction 

Some  of  the  recent  work  in  the  area  of  automated  diagnosis  has  been  moving  in  the  di¬ 
rection  of  using  formal  logic  to  represent  causal  knowledge  about  real  world  systems.  In 
[DW85]  and  [Rei85],  for  example,  digital  circuits  are  diagnosed  using  variations  of  this  rep¬ 
resentation  scheme.  In  this  paper  we  describe  an  alternate  representation  and  diagnosis 
method which, we believe, more easily extends to complex devices such as analog hard-


referred  to  as  the  relation  method.  It  is  used  as  the  primary  reasoning  technique  in  the 
MONAD  system  in  which  it  controls  most  of  the  general  search  processes  in  addition  to 
being  central  to  performing  diagnosis  and  analogy  tasks  which  are  the  target  applications 
of  that  system. 

In  this  paper  we  are  attempting  to  show  how  the  relation  method  can  be  used  for 
the  diagnosis  of  static  systems;  we  are  not  trying  to  reinvent  diagnosis.  The  important 
point  is  that  the  technique  described  here  is  uniform  with  those  applied  to  other  tasks 
that  are  addressed  by  the  MONAD  system  such  as  analyzing  static  systems,  proposing 
modifications  in  the  analogical  reasoning  method,  and  even  revising  the  knowledge  state 
as  part  of  the  reasoning  control  process.  To  understand  some  of  the  details  of  the  concept, 
we  apply  the  relation  method  and  its  associated  representation  to  the  diagnosis  of  digital 
circuits  as  discussed  in  [Rei85].  An  abstract  diagnosis  task  is  also  discussed  to  provide 
an  illustration  of  the  extensibility  of  the  method  and  representation.  We  will  only  treat 
static  circuitry  in  this  paper  since  time  dependent  reasoning  and  reasoning  about  flow 
systems  are  treated  using  the  reasoning  control  mechanism.  Some  systems  with  loops  can 
be  treated  as  an  extension  to  the  static  method  described  here  although  they  are  generally 
dealt  with  using  the  flow  analysis  method  which  handles  dynamic  systems. 

The  method  described  here  corresponds  to  a  process  in  the  MONAD  system  and 
therefore  has  the  goal  of  formulating  an  abstract  problem  specification  in  terms  of  a  set 
of  equations.  The  method  we  describe  here  is  an  example  of  such  a  process  for  static 
digital logic. There is a corresponding process for dealing with analog circuits based on
the  Lagrange  method  for  describing  systems.  It  develops  a  set  of  differential  equations  from 
an  abstract  circuit  description  while  the  method  here  develops  a  set  of  digital  differential 
equations  from  a  digital  circuit  specification.  Both  of  these  are  handled  uniformly  under 
the  relation  method  of  the  MONAD  system. 

2  The  Diagnosis  Process 

Many  of  the  diagnosis  systems  developed  over  the  past  few  years  use  the  method  of  in¬ 
consistency  to  hypothesize  the  failed  component1  of  the  target  system,  for  example,  see 
[Bon82].  Without  considering  specific  techniques,  this  method  determines  that  the  set  of 
behavioral  measurements  of  the  target  system  are  inconsistent  with  the  expected  measure¬ 
ment  values.  From  this  observation,  one  may  deduce  that  the  system  is  not  functioning 
properly  (by  the  definition  embedded  in  the  inconsistency  test).  From  these  inconsistencies 
an  explanation  of  the  malfunction  is  hypothesized  as  the  failure  of  one  component  in  the 
target  system.  This  last  step  is  somewhat  weak.  This  is  because  it  is  not  always  clear  that 
we  can  know  all  of  the  implications  of  a  component  failure. 


1 For the moment we shall restrict our comments to single fault diagnosis.

2.1 Reiter’s Approaches

In  this  section  we  will  briefly  mention  the  approach  of  [Rei85]  to  give  some  flavor  to 
our  discussion.  Reiter  defines  diagnosis  in  terms  of  the  normal  or  abnormal  operation  of 
each  of  its  components.  Each  failure  condition  of  the  component  set  can  be  modeled  as 
a  consistency  between  the  behavioral  specification  of  the  system  under  the  hypothesized 
failure  condition  and  the  observed  measurements.  This  is  only  correct  if  the  universe  of 
failure  modes  and  their  implications  is  completely  known.  We  will  review  this  notion  in 
Reiter’s  terms. 

Reiter defines a system as the pair (SD, COMPONENTS) where SD is a set of first
order sentences representing the system description or behavior, and COMPONENTS
is a finite set of constants representing the components of the system. Observations of the
state of the system are represented as a set of first order sentences OBS. Component failure
is represented in terms of individual components. AB(c) is a “distinguished” predicate
whose intended meaning is that component c is “abnormal”; the negation ¬AB(c) means
that c is functioning “normally”. If all the components of the system are functioning
normally, then $SD \cup OBS \cup \{\neg AB(c) \mid c \in COMPONENTS\}$ is consistent.

Then $\Delta$ is a diagnosis for (SD, COMPONENTS, OBS) if, for each $c_i \in \Delta$,

$$SD \cup OBS \cup \{\neg AB(c) \mid c \in COMPONENTS - \Delta\} \models AB(c_i). \qquad (1)$$

This notion is simplified by showing that $\Delta \subseteq COMPONENTS$ is a diagnosis for
(SD, COMPONENTS, OBS) iff $\Delta$ is a minimal set such that:

$$SD \cup OBS \cup \{\neg AB(c) \mid c \in COMPONENTS - \Delta\} \qquad (2)$$

is consistent.
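
The minimality condition of (2) can be sketched by brute force in Python. The example below is hypothetical and is not taken from [Rei85]: it assumes two inverters `inv1` and `inv2` in series, a normal inverter obeys its function, and an abnormal one is left unconstrained. The consistency test simply searches for a value of the hidden wire between them that satisfies every component assumed normal.

```python
from itertools import combinations

COMPONENTS = ("inv1", "inv2")  # hypothetical two-inverter chain


def consistent(abnormal, obs):
    """True if some value of the hidden wire satisfies every normal component."""
    for mid in (0, 1):
        ok1 = "inv1" in abnormal or mid == 1 - obs["in"]
        ok2 = "inv2" in abnormal or obs["out"] == 1 - mid
        if ok1 and ok2:
            return True
    return False


def minimal_diagnoses(components, obs):
    """Enumerate candidate sets by increasing size; keep the minimal consistent ones."""
    found = []
    for size in range(len(components) + 1):
        for cand in combinations(components, size):
            delta = frozenset(cand)
            if any(d <= delta for d in found):
                continue  # a smaller diagnosis is already contained in delta
            if consistent(delta, obs):
                found.append(delta)
    return found


faulty = minimal_diagnoses(COMPONENTS, {"in": 1, "out": 0})
healthy = minimal_diagnoses(COMPONENTS, {"in": 1, "out": 1})
```

With input 1 and observed output 0, either inverter alone explains the observation; when the output is the expected 1, the empty set is the only minimal diagnosis. The enumeration is exponential in the number of components, which is the practical difficulty taken up in the next section.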

2.2  The  Source  of  the  Problem 

Reiter  has  neatly  separated  the  diagnosis  task  into  two  processes:  enumerating  the  possible 
failure  conditions  and  testing  consistency  of  the  observations  with  the  system  description. 
Although  both  of  these  processes  are  logically  sound,  it  is  not  clear  that  they  are  always 
practical  to  implement  in  a  direct  way.  In  real  world  diagnosis  problems,  the  nature  of  the 
set of failure modes and/or implications of the failure modes of a set of components is not
typically  completely  known.  An  experienced  expert  human  diagnostician  will,  in  general, 
hedge  on  the  certainty  of  a  particular  diagnosis  based  only  on  a  given  set  of  measurements2. 
Rather,  they  will  usually  recite  processes  which  will  lead  toward  the  elimination  of  certain 
components  from  suspicion. 

The technical interpretation of this difficulty is that behavior of the system may be too
difficult to describe in one unified set of formulae SD. The alternative is to use a context
sensitive description so that each set of hypothesized component failures uniquely specifies
a system behavior model which must then be checked for consistency with the observations.
The method described in this paper tries to follow the path of least resistance by using
the type of information which experts tend to give: rules for eliminating search paths.
It reasons about possible component failure assignments in order to avoid the process of
testing their consistency. Further, it is based on a set of abstract behavior models which
may be analyzed symbolically to reduce the cost of testing the consistency in some cases.

2 They have countless tales of the “tough ones” which were only tracked down after several unfruitful
cycles of diagnosis and component replacement.

3  Diagnosis  Based  on  Variational  Analysis 

Our  approach  to  diagnosis  differs  from  most  in  that  it  tries  to  perform  diagnosis  on  a 
symbolic  level  rather  than  by  examining  values.  The  problem  is  first  formulated  as  a  set  of 
symbolic  equations  which  may  then  be  specialized  in  terms  of  the  available  measurements, 
and  finally  solved,  again  symbolically,  to  produce  a  bound  on  the  set  of  possible  failures. 

Our  approach  is  similar  to  Reiter’s  in  that  it  finds  a  set  of  explanations  for  the  malfunc¬ 
tion  of  a  system  by  finding  a  set  of  failure  mode  vectors  which  predict  a  system  behavior 
which  is  consistent  with  the  observed  failure.  Clearly,  if  the  failure  models  of  the  compo¬ 
nents  and  the  system  are  correct,  then  this  method  is  sound.  This  approach  can  have  the 
disadvantage  we  have  cited  above  that,  in  principle,  the  system  behavior  must  be  tested 
for  each  allowable  failure  vector.  We  describe  an  interesting  alternative  to  the  enumeration 
process  which  is  based  on  the  relation  method  that  is  used  as  the  primary  search  strategy 
in  the  MONAD  system.  It  attempts  to  prune  the  search  by  manipulating  a  symbolic 
model  of  the  system  under  test  following  which  specific  failure  modes  may  be  explored. 
By  using  approximations  for  the  behavioral  models,  we  may  further  simplify  the  search. 


3.1  Relation  Method 

Prior  to  describing  our  approach  to  diagnosis,  we  will  briefly  review  the  relation  method 
which  is  described  more  fully  in  [Por87]. 

In  principle,  the  relation  method  is  part  of  the  differential  reasoning  engine  which  forms 
the  heart  of  the  MONAD  system.  It  has  the  ability  to  uniformly  find  relations  between 
various  kinds  of  data  objects  within  the  knowledge  base.  The  three  basic  representations 
of  a  relation  between  objects  are  equations,  meaning  a  set,  a  conventional  mathematical 
equation,  or  a  logical  expression;  a  process  which  is  a  procedure  coded  either  directly  or  in 
the  model  language  that  produces  an  equation;  and  the  type  hierarchy  which  is  essentially 
a  semantic  net  that  stores  random  relations  between  objects  within  the  system. 

The relation method knows how to selectively perform dependency analysis and
differentiation on any combination of these representations. For example, figure 1 shows a
circuit consisting of some multiplier and adder devices. Elements $M_1$, $M_2$, and $M_3$ are
digital multiplier devices and $A_1$ and $A_2$ are digital adder devices. Figure 2 shows the
result of differentiating the output variables X and Y with respect to two of the input
variables a and c.


Figure 1: Multiplier Adder Block Diagram


We  will  refer  intuitively  and  somewhat  formally  to  these  concepts  in  developing  our 
diagnosis  method. 

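X depends on j and k
Y depends on k and l
j depends on a and b
k depends on c and d
l depends on e and f


$$\frac{dX}{da} = \frac{dX}{dj}\,\frac{dj}{da} \qquad (3)$$

$$\frac{dX}{dc} = \frac{dX}{dk}\,\frac{dk}{dc}, \qquad \frac{dY}{dc} = \frac{dY}{dk}\,\frac{dk}{dc} \qquad (4)$$
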

Figure  2:  Multiplier  Adder  Dependency  Results 
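
The dependency results of Figure 2 can be checked numerically with a short Python sketch. The wiring is an assumption inferred from the dependency list ($M_1$ computes j = a·b, $M_2$ computes k = c·d, $M_3$ computes l = e·f, $A_1$ computes X = j + k, $A_2$ computes Y = k + l), and the function names are hypothetical. The derivative of an output with respect to an input is obtained by summing, over every dependency path, the product of the local partial derivatives along that path.

```python
def local_partials(v):
    """Local derivative of each device output with respect to each of its inputs."""
    return {
        ("X", "j"): 1.0, ("X", "k"): 1.0,          # adder A1: X = j + k
        ("Y", "k"): 1.0, ("Y", "l"): 1.0,          # adder A2: Y = k + l
        ("j", "a"): v["b"], ("j", "b"): v["a"],    # multiplier M1: j = a * b
        ("k", "c"): v["d"], ("k", "d"): v["c"],    # multiplier M2: k = c * d
        ("l", "e"): v["f"], ("l", "f"): v["e"],    # multiplier M3: l = e * f
    }


def derivative(out, inp, partials):
    """Chain rule: sum over dependency paths of the product of local partials."""
    if out == inp:
        return 1.0
    return sum(p * derivative(child, inp, partials)
               for (parent, child), p in partials.items() if parent == out)


values = {"a": 2.0, "b": 3.0, "c": 5.0, "d": 7.0, "e": 11.0, "f": 13.0}
partials = local_partials(values)
dX_da = derivative("X", "a", partials)   # (dX/dj)(dj/da) = b
dX_dc = derivative("X", "c", partials)   # (dX/dk)(dk/dc) = d
dY_dc = derivative("Y", "c", partials)   # (dY/dk)(dk/dc) = d
dY_da = derivative("Y", "a", partials)   # no path from Y to a
```

Because Y has no dependency path to a, its derivative with respect to a vanishes, which is the kind of pruning information the dependency analysis provides.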


4  The  Full  Adder  Example 

Now  we  will  apply  the  variational  principles  of  the  relation  method  to  the  problem  of 
diagnosis.  In  diagnosing  a  system,  we  are  actually  attempting  to  find  a  subspace  of  the  be¬ 
havioral  universe  which  contains  the  instance  behaviors  obtained  from  the  measurements. 
There  are  two  approaches  to  arrive  at  the  component  failures  which  correspond  to  this 
subspace  and  thereby  could  cause  the  observed  behavior.  One  approach  is  to  construct 
the  subspace  containing  the  observations  by  hypothesizing  various  component  failures  and 
testing  to  see  that  the  observations  are  contained  in  the  resulting  subspace.  An  alternate 
is  to  find  a  mapping  from  the  various  possible  subspaces  to  the  component  failures  which 
could  cause  them.  The  first  approach  is  the  one  we  shall  pursue  since  it  has  more  potential 
for  succeeding  on  complex  problems.  The  second  one  corresponds  to  the  approach  used  in 
the first generation of expert systems having shallow knowledge bases.

To gain an intuitive feeling for this approach, we will informally apply it
to a simple example which was treated in [Rei85]. We will then provide a brief formal
description to make these ideas concrete.

Figure 3 shows a circuit of a full adder. It is composed of AND gates $A_1$ and $A_2$,
exclusive-or gates $X_1$ and $X_2$, and an OR gate $O_1$. The behavior of the circuit is to add
together the three binary inputs a, b, and c resulting in the two binary outputs SUM and
CAR. Each of these inputs and outputs meets the criterion $a, b, c, SUM, CAR \in \{0, 1\}$.
SUM corresponds to the low order bit of the sum of the arguments and CAR corresponds
to the carry bit - the 2’s bit - of the sum of the input arguments. The table below specifies
the outputs which result from any legal combination of inputs.


[Figure 3 labels: outputs SUM and CAR, annotated $\partial SUM/\partial DEV \neq 0$ and $\partial CAR/\partial DEV \neq 0$]

Figure 3: Full Adder Circuit Diagram


abc    SUM    CAR

000     0      0
001     1      0
010     1      0
011     0      1
100     1      0
101     0      1
110     0      1
111     1      1
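
The table can be reproduced by simulating the gates directly. The Python sketch below assumes the usual full adder wiring, which matches the dependency relations of this example: $X_1$ produces d from a and b, $X_2$ produces SUM from c and d, $A_1$ produces e from a and b, $A_2$ produces f from c and d, and $O_1$ produces CAR from e and f. The function name `full_adder` is hypothetical.

```python
from itertools import product


def full_adder(a, b, c):
    """Gate-level model of the full adder; returns (SUM, CAR)."""
    d = a ^ b          # exclusive-or gate X1
    total = d ^ c      # exclusive-or gate X2 drives SUM
    e = a & b          # AND gate A1
    f = c & d          # AND gate A2
    carry = e | f      # OR gate O1 drives CAR
    return total, carry


# Rebuild the truth table, keyed by the input bits (a, b, c).
table = {bits: full_adder(*bits) for bits in product((0, 1), repeat=3)}
```

Each row satisfies a + b + c = 2·CAR + SUM, which is a convenient check on the model.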

The  relations  below  describe  the  dependencies  of  this  example: 


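$$
\begin{aligned}
\{OUTPUTS\} &\triangleright \{SUM, CAR\} \\
\{SUM\} &\triangleright \{F_{X_2}\} \\
\{F_{X_2}\} &\triangleright \{c, d, \Delta_{X_2}\} \\
\{d\} &\triangleright \{F_{X_1}\} \\
\{F_{X_1}\} &\triangleright \{a, b, \Delta_{X_1}\} \\
\{CAR\} &\triangleright \{F_{O_1}\} \\
\{F_{O_1}\} &\triangleright \{e, f, \Delta_{O_1}\} \\
\{e\} &\triangleright \{F_{A_1}\} \\
\{F_{A_1}\} &\triangleright \{a, b, \Delta_{A_1}\} \\
\{f\} &\triangleright \{F_{A_2}\} \\
\{F_{A_2}\} &\triangleright \{c, d, \Delta_{A_2}\} \\
\{a, b, c\} &\triangleright \{INPUTS\} \\
\{\Delta_{X_1}, \Delta_{X_2}\} &\triangleright \{XORGATE\} \\
\{\Delta_{A_1}, \Delta_{A_2}\} &\triangleright \{ANDGATE\} \\
\{\Delta_{O_1}\} &\triangleright \{ORGATE\} \\
\{XORGATE, ANDGATE, ORGATE\} &\triangleright \{DEV\}
\end{aligned}
$$

where $a \triangleright b$ means that a depends on b in some way. The functions $F_{DEV}$ correspond to
the behavioral model of the individual devices and are used to determine the details of the
dependency.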
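
As an illustration, the dependency relations can be stored as a graph and searched to find which device failure variables each output can reach. The Python sketch below makes some naming assumptions: the failure variables are written `D_X1`, `D_X2`, and so on, the behavior functions `F_X1`, `F_X2`, and so on, and the helper `failure_vars` is hypothetical. Only the relations leading from the outputs down to the devices are encoded.

```python
# Each key depends on every name in its list (output-to-device relations only).
DEPENDS_ON = {
    "SUM": ["F_X2"],
    "F_X2": ["c", "d", "D_X2"],
    "d": ["F_X1"],
    "F_X1": ["a", "b", "D_X1"],
    "CAR": ["F_O1"],
    "F_O1": ["e", "f", "D_O1"],
    "e": ["F_A1"],
    "F_A1": ["a", "b", "D_A1"],
    "f": ["F_A2"],
    "F_A2": ["c", "d", "D_A2"],
}


def failure_vars(node, graph=DEPENDS_ON):
    """Collect every failure variable reachable from node by following dependencies."""
    found, stack, seen = set(), [node], set()
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        if current.startswith("D_"):
            found.add(current)
        stack.extend(graph.get(current, []))
    return found


sum_devices = failure_vars("SUM")
car_devices = failure_vars("CAR")
```

The search reaches the two exclusive-or gates from SUM, and the OR gate, both AND gates, and the first exclusive-or gate from CAR.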

For  the  moment,  imagine  that  we  have  defined  the  notion  of  taking  the  derivative  of 
the  outputs  with  respect  to  the  set  of  all  devices.  We  would  then  be  able  to  write: 

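$$
\begin{aligned}
\frac{\partial SUM}{\partial DEV} ={}& \frac{\partial SUM}{\partial F_{X_2}} \frac{\partial F_{X_2}}{\partial \Delta_{X_2}} \frac{\partial \Delta_{X_2}}{\partial XORGATE} \frac{\partial XORGATE}{\partial DEV} \\
&+ \frac{\partial SUM}{\partial F_{X_2}} \frac{\partial F_{X_2}}{\partial d} \frac{\partial d}{\partial F_{X_1}} \frac{\partial F_{X_1}}{\partial \Delta_{X_1}} \frac{\partial \Delta_{X_1}}{\partial XORGATE} \frac{\partial XORGATE}{\partial DEV}
\end{aligned}
$$

$$\left\{ \frac{\partial SUM}{\partial DEV} \right\} \triangleright \{X_1, X_2\}$$

and

$$
\begin{aligned}
\frac{\partial CAR}{\partial DEV} ={}& \frac{\partial CAR}{\partial F_{O_1}} \frac{\partial F_{O_1}}{\partial \Delta_{O_1}} \frac{\partial \Delta_{O_1}}{\partial ORGATE} \frac{\partial ORGATE}{\partial DEV} \\
&+ \frac{\partial CAR}{\partial F_{O_1}} \frac{\partial F_{O_1}}{\partial e} \frac{\partial e}{\partial F_{A_1}} \frac{\partial F_{A_1}}{\partial \Delta_{A_1}} \frac{\partial \Delta_{A_1}}{\partial ANDGATE} \frac{\partial ANDGATE}{\partial DEV} \\
&+ \frac{\partial CAR}{\partial F_{O_1}} \frac{\partial F_{O_1}}{\partial f} \left[ \frac{\partial f}{\partial F_{A_2}} \frac{\partial F_{A_2}}{\partial \Delta_{A_2}} \frac{\partial \Delta_{A_2}}{\partial ANDGATE} \frac{\partial ANDGATE}{\partial DEV} \right. \\
&\qquad \left. + \frac{\partial f}{\partial F_{A_2}} \frac{\partial F_{A_2}}{\partial d} \frac{\partial d}{\partial F_{X_1}} \frac{\partial F_{X_1}}{\partial \Delta_{X_1}} \frac{\partial \Delta_{X_1}}{\partial XORGATE} \frac{\partial XORGATE}{\partial DEV} \right]
\end{aligned}
$$

$$\left\{ \frac{\partial CAR}{\partial DEV} \right\} \triangleright \{O_1, A_1, A_2, X_1\}$$

where XORGATE, ANDGATE, and ORGATE are the sets of device types and DEV =
{XORGATE, ANDGATE, ORGATE} is the set of all devices. The special symbol $\Delta$ is
used to indicate device failure.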

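We may simplify the above equations by observing that, by definition,
$\frac{\partial\, node}{\partial F_{DEV}} = \frac{\partial \Delta_{DEV}}{\partial CAT} = \frac{\partial CAT}{\partial DEV} = 1$,
where the nodes are $node \in \{d, e, f\}$ and the categories $CAT \in DEV$ (the outputs SUM
and CAR being treated like the nodes). Substituting these values into the above equations
we have: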


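$$\frac{\partial SUM}{\partial DEV} = \frac{\partial F_{X_2}}{\partial \Delta_{X_2}} + \frac{\partial F_{X_2}}{\partial d} \frac{\partial F_{X_1}}{\partial \Delta_{X_1}}$$

$$\frac{\partial CAR}{\partial DEV} = \frac{\partial F_{O_1}}{\partial \Delta_{O_1}} + \frac{\partial F_{O_1}}{\partial e} \frac{\partial F_{A_1}}{\partial \Delta_{A_1}} + \frac{\partial F_{O_1}}{\partial f} \left[ \frac{\partial F_{A_2}}{\partial \Delta_{A_2}} + \frac{\partial F_{A_2}}{\partial d} \frac{\partial F_{X_1}}{\partial \Delta_{X_1}} \right]$$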


These two equations are called the behavioral equations for the full adder circuit. The
first may be interpreted to mean that the variation of the SUM output is a function of
the failure mode of the $X_2$ exclusive-or gate, or the failure mode of the $X_1$ exclusive-or
gate after it passes through the $X_2$ gate. That is, $\partial F_{X_2}/\partial \Delta_{X_2}$ or $\partial F_{X_1}/\partial \Delta_{X_1}$ are
non-zero if $X_2$ or $X_1$ are operating in a failure mode respectively. The term
$\partial F_{X_2}/\partial d$ is the behavior of the output of the $X_2$ gate with respect to changes in
the d input. The second equation is read in a similar
fashion but there are a few more possible causes for an output change.

Given a set of measurements on the system, we wish to evaluate the above behavioral
equations in light of the model of the function of the circuit and its elements to determine a
more definitive diagnosis. By the definition of normal operation of the circuit,
$\partial F_{DEV}/\partial \Delta_{DEV} = 0$; that is, the variation of the output of a device is zero with
respect to the failure mode variable of a device if and only if the device is not faulty. This
notion is reflected in the above equations in that $\partial SUM/\partial DEV$ and $\partial CAR/\partial DEV$
will be identically 0 when all the $\partial F_{DEV}/\partial \Delta_{DEV} = 0$. Correspondingly,
a defective component has the characteristic that one or more
$\partial F_{DEV}/\partial \Delta_{DEV} \neq 0$. If we can
define an algebra for the types of circuit elements for digital logic - that is, if we can define
an algebra for logic - we can evaluate these equations symbolically to determine a solution
for the diagnosis.


As an intuitive example, suppose we have made some measurements on the circuit
and determine that both the SUM and the CAR outputs sometimes give bad results.
By examining the dependencies, we may say that the possible diagnosis for this failure
is a subset of the set whose elements are the sets that contain the union of all pairs of
subsets of the two sets $\{X_1, X_2\}$ and $\{O_1, A_1, A_2, X_1\}$. That is, $\forall S \subseteq \{X_1, X_2\}$, $\forall T \subseteq
\{O_1, A_1, A_2, X_1\}$, $DIAG \in S \cup T$. For example, $\{X_1\}$, $\{X_2, O_1\}$, $\{X_2, A_2\}$, $\{X_2, A_1\}$ are
some possible diagnoses. Note that the last one falls into the class of solutions which may
not be allowable due to the lack of detail of the model. Therefore, it must be evaluated to
determine that it does not apply.
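The candidate set described above can be enumerated directly. The following is a minimal sketch, assuming that each faulty output needs at least one suspect from its own dependency set (so only non-empty subsets are paired); the names `SUM_DEPS`, `CAR_DEPS`, and `nonempty_subsets` are illustrative and do not come from the paper.

```python
from itertools import combinations

def nonempty_subsets(devices):
    """Yield every non-empty subset of `devices` as a frozenset."""
    items = sorted(devices)
    for size in range(1, len(items) + 1):
        for combo in combinations(items, size):
            yield frozenset(combo)

# Devices that each faulty output depends on (from the behavioral equations).
SUM_DEPS = {"X1", "X2"}
CAR_DEPS = {"O1", "A1", "A2", "X1"}

# A candidate diagnosis is the union of one suspect set per faulty output.
candidates = {s | t
              for s in nonempty_subsets(SUM_DEPS)
              for t in nonempty_subsets(CAR_DEPS)}

for diag in sorted(candidates, key=lambda d: (len(d), sorted(d))):
    print(sorted(diag))
```

The single-device candidate {X1} appears because X1 lies in both dependency sets; every other candidate names at least two devices.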

5  An  Abstract  Algebra  for  Logic 

In  this  section  we  will  define  the  necessary  abstract  algebra  to  perform  diagnosis  on  simple 
logical  systems.  The  basic  requirements  are  to  define  the  standard  logical  operations,  their 
properties,  and  the  meaning  of  dependency  and  differentiation  in  the  abstraction. 

We define a differential logic as any system C on the binary numbers {0, 1} having
the closed binary operations $+$, $\cdot$ (typically omitted), partial differentiation (written $\frac{\partial}{\partial}$),
and differentiation (written $\frac{d}{d}$); having the closed unary operation NOT (written $\neg$); and
having the binary relation DEPENDS (written as $a \triangleright b$ to mean that a depends on b and
as $a \not\triangleright b$ to mean that a does not depend on b). For $\{a, b, c, F\} \in \{0, 1\}$, these operations
and relations satisfy the following:


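$$\begin{aligned}
0 + 0 &= 0 & 0 \cdot 0 &= 0 \\
1 + 1 &= 1 & 1 \cdot 1 &= 1 \\
a + 1 &= 1 + a = 1 & a \cdot 1 &= 1 \cdot a = a \\
a + 0 &= 0 + a = a & a \cdot 0 &= 0 \cdot a = 0 \\
a + a &= a & a \cdot a &= a \\
a + b &= b + a & a \cdot b &= b \cdot a \\
\neg 1 &= 0 & \neg 0 &= 1 \\
\neg(a \cdot b) &= (\neg a) + (\neg b) & \neg(a + b) &= (\neg a) \cdot (\neg b) \\
a \cdot (b \cdot c) &= (a \cdot b) \cdot c & a \cdot (b + c) &= (a \cdot b) + (a \cdot c)
\end{aligned}$$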
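$$a \triangleright b \triangleright c \;\Rightarrow\; \frac{da}{dc} = \frac{\partial a}{\partial b} \frac{db}{dc}$$

$$F \triangleright \{a, b\} \triangleright c \;\Rightarrow\; \frac{dF}{dc} = \frac{\partial F}{\partial a} \frac{da}{dc} + \frac{\partial F}{\partial b} \frac{db}{dc}$$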
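The non-differential identities of the logic can be checked exhaustively over {0, 1}. The sketch below assumes that + is interpreted as logical OR and · as logical AND, which is what the identities 1 + 1 = 1 and a · a = a suggest; the function names are illustrative.

```python
from itertools import product

def OR(a, b):
    return a | b

def AND(a, b):
    return a & b

def NOT(a):
    return 1 - a

def axioms_hold():
    """Check the listed identities for every assignment of a, b, c in {0, 1}."""
    for a, b, c in product((0, 1), repeat=3):
        checks = [
            OR(a, 1) == 1, OR(a, 0) == a, OR(a, a) == a,
            OR(a, b) == OR(b, a),
            AND(a, 1) == a, AND(a, 0) == 0, AND(a, a) == a,
            AND(a, b) == AND(b, a),
            NOT(AND(a, b)) == OR(NOT(a), NOT(b)),   # De Morgan
            NOT(OR(a, b)) == AND(NOT(a), NOT(b)),   # De Morgan
            AND(a, AND(b, c)) == AND(AND(a, b), c), # associativity
            AND(a, OR(b, c)) == OR(AND(a, b), AND(a, c)),  # distributivity
        ]
        if not all(checks):
            return False
    return True

print(axioms_hold())
```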


The functions $F_{op}(a, b)$ are the behavioral models for each operation under differentiation.
Definitions of these functions under the normal operational model of the device are
shown below. For convenience, we include the exclusive-or operation $a \oplus b = (a \cdot \neg b) + (\neg a \cdot b)$.


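$$\begin{aligned}
\frac{\partial (a \cdot b)}{\partial a} &= b \cdot \neg\left(\frac{\partial b}{\partial a}\right) &
\frac{\partial (a \cdot b)}{\partial b} &= a \cdot \neg\left(\frac{\partial a}{\partial b}\right) \\
\frac{\partial (a + b)}{\partial a} &= \neg b \cdot \neg\left(\frac{\partial b}{\partial a}\right) &
\frac{\partial (a + b)}{\partial b} &= \neg a \cdot \neg\left(\frac{\partial a}{\partial b}\right) \\
\frac{\partial (\neg a)}{\partial a} &= 1 \\
\frac{\partial (a \oplus b)}{\partial a} &= \neg\left(\frac{\partial b}{\partial a}\right) &
\frac{\partial (a \oplus b)}{\partial b} &= \neg\left(\frac{\partial a}{\partial b}\right)
\end{aligned}$$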


We may see how these are derived by considering an example. The table below shows
the input/output relations for $a \cdot b$.

    a   b   a·b   ∂(a·b)/∂a   ∂(a·b)/∂b
    0   0    0        0           0
    0   1    0        1           0
    1   0    0        0           1
    1   1    1        1           1

Note that, if we hold b constant and vary a, then if the output $a \cdot b$ changes, the partial
derivative is 1. In order to actually use this idea, we must assure that b remains constant,
and thus the terms $\neg\left(\frac{\partial b}{\partial a}\right)$ and $\neg\left(\frac{\partial a}{\partial b}\right)$.
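The "hold the other input constant and vary one input" reading of the partial derivative can be written as a small function. This is a sketch of that reading only (it is the classical Boolean difference), assuming the other inputs really are independent of the one being varied; the helper name `partial` is illustrative.

```python
def partial(f, index, args):
    """Return 1 if flipping input `index` (others held fixed) changes f, else 0."""
    lo = list(args)
    hi = list(args)
    lo[index], hi[index] = 0, 1
    return f(*lo) ^ f(*hi)

def AND(a, b):
    return a & b

def OR(a, b):
    return a | b

def XOR(a, b):
    return a ^ b

def NOT(a):
    return 1 - a

# Tabulate the derivative of each two-input gate with respect to input a.
for a in (0, 1):
    for b in (0, 1):
        print(a, b,
              "AND:", partial(AND, 0, (a, b)),
              "OR:", partial(OR, 0, (a, b)),
              "XOR:", partial(XOR, 0, (a, b)))
```

With b held constant the results reduce to b for AND, ¬b for OR, and 1 for XOR and NOT, which are the normal-behavior entries with the ¬(∂b/∂a) factor equal to 1.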


5.1  The  Approach  to  Diagnosis 

To see the utility of this definition, we will continue exploring our full adder example
above. Suppose we have made some measurements on a system. How may we use these
measurements to constrain our hypotheses of possible failure sets which can explain these
measurements? Referring to the table of differentiation above, we see that a failed component
may be modeled as a function of the input arguments in addition to an argument
that selects the appropriate model for each failure mode. So a device having two inputs
{a, b}, one output OUT, and one failure mode could be modeled as:

$$OUT = F(a, b, \Delta)$$

where $\Delta$ is a variable whose value is zero if the device is functioning normally and non-zero
if it is operating in the failure mode. To reason about the failure of the entire circuit,
we must try to determine the combination of $\Delta$ values for the various devices that would
produce the failed behavior of the entire circuit. We may accomplish this in two stages.
First, we find the variation of the outputs as a result of varying the $\Delta$'s; that is, we
derive the general behavioral equations for the circuit. Second, we hypothesize various
failure modes to determine if the exact behavior can be explained. Often we will find that
performing the second step using a very simple model of the failure modes of the devices is
sufficient to determine the circuit fault or faults (we note here that this approach handles
multiple faults in the same way as it handles single ones). However, in complex multiple-fault
problems, we may have to resort to repeating the second step with a more complex failure
model for the components. Although the repetitions of the second step can be combined
by using a complex model the first time, we purposely separate them since each iteration
reduces the search space for the next. In general, the second step may be repeated using
more complex models until a unique and consistent diagnosis is obtained.

The first step is accomplished in keeping with the style of automatic problem formulation
as embodied in the MONAD system. The circuit is modeled as a set of interconnected
components. The interconnection graph determines the dependency of the various
nodes on the devices. The behavioral model is derived by performing variational analysis
of the system by taking the derivative of each output with respect to the set of component
$\Delta$'s³. Each derivative results in an equation; collected over all outputs, these equations form the
behavioral model for the system.
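How the interconnection graph determines device dependencies can be sketched with a short recursive traversal. The wiring table below is an assumption inferred from the full adder equations in this paper (d from X1, e from A1, f from A2, SUM from X2, CAR from O1); `CIRCUIT` and `device_dependencies` are illustrative names, not part of MONAD.

```python
# Each internal node maps to (device that drives it, list of input nodes).
# Nodes that do not appear as keys (a, b, c) are primary inputs.
CIRCUIT = {
    "d":   ("X1", ["a", "b"]),
    "e":   ("A1", ["a", "b"]),
    "f":   ("A2", ["c", "d"]),
    "SUM": ("X2", ["d", "c"]),
    "CAR": ("O1", ["e", "f"]),
}

def device_dependencies(node, circuit=CIRCUIT):
    """Return the set of devices whose failure can affect `node`."""
    if node not in circuit:          # primary input: no device upstream
        return set()
    device, inputs = circuit[node]
    deps = {device}
    for inp in inputs:
        deps |= device_dependencies(inp, circuit)
    return deps

print(sorted(device_dependencies("SUM")))
print(sorted(device_dependencies("CAR")))
```

Each device in a dependency set contributes one term to the corresponding behavioral equation.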

The second step is accomplished by examining the expected output with respect to
measurements on the system. Substituting these measured values into the behavioral
equations yields a set of inequalities based on the difference in the observed and expected
behavior. More specifically, if we take the derivative of output X with respect to the set
of devices, the result is an equation. It will be a sum of terms composed of the product
of system dependent variables and partial derivatives of the form $\frac{\partial F_{DEV}}{\partial \Delta_{DEV}}$.
For a given observation, the difference in the expected output and the observed output
is set equal to the derivative of the output with respect to the devices, since the devices
are assumed to be the cause of the difference. The set of such equations may be solved⁴
symbolically to give information about the possible explanation of the observed system
behavior. If the set is definitive, the process may be stopped; otherwise, substitutions for
specific failure models, $\frac{\partial F_{DEV}}{\partial \Delta_{DEV}}$, may be used to find a solution which predicts the observed
behavior.

³ For efficiency, of course, one would use only the relevant outputs and the relevant components.

For example, suppose we believe that the outputs of each gate in the full adder circuit
can either behave normally, malfunction all of the time, be stuck at 0, or be stuck at 1.
For each operation above, the output is then modeled as the function $F_{dev}(a, b, \Delta)$, where
$\Delta$ is the variable which moves the output from normal behavior toward that of the failure
mode. For our full adder example, a suitable set of failure models is shown below⁵:


    Function    Normal Behavior                      Undefined Model   Stuck at Zero         Stuck at One
    F(a, b)     ∂F/∂a            ∂F/∂b               ∂F/∂Δ             ∂F/∂Δ                 ∂F/∂Δ

    a · b       b · ¬(∂b/∂a)     a · ¬(∂a/∂b)        ∇                 (a · b) · ∇           ¬(a · b) · ∇
    a + b       ¬b · ¬(∂b/∂a)    ¬a · ¬(∂a/∂b)       ∇                 (a + b) · ∇           ¬(a + b) · ∇
    ¬a          1                1                   ∇                 ¬a · ∇                a · ∇
    a ⊕ b       ¬(∂b/∂a)         ¬(∂a/∂b)            ∇                 (a ⊕ b) · ∇           ¬(a ⊕ b) · ∇


Note that for stuck at 0 and stuck at 1, $\frac{\partial F}{\partial a} = \frac{\partial F}{\partial b} = 0$. We have used the variable $\nabla = 1$ to
indicate that a given failure mode for a device is selected. This variable is generally used
in the corresponding equation for $\frac{\partial F}{\partial \Delta}$.

Given a set of models $m$, such as those above for the failure modes of the device, then
a total model $M_{total}$ may be composed of several failure modes at once using the relation:

$$M_{total} = \sum_{n} \nabla^{n}_{DEV} \, m_n \qquad (5)$$

where the $\nabla^{n}_{DEV}$⁶ select the possible modes of failure for the device as appropriate. Mutual
exclusion of models such as stuck at 0 and stuck at 1 must be accounted for in the allowable $\nabla$
vectors which define the search space for the solution. If we choose no model for the failure
(that is, the output is always wrong), we may not be able to discern when a measurement
can appear correct even though a device is faulty.
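The four failure models in the table can be read as a mode selector applied to a gate's fault-free output. The sketch below assumes exactly one mode is selected per gate (the mutual exclusion discussed above) and that the "undefined" model means the output is always inverted; `gate_output` and `dF_dDelta` are hypothetical helper names.

```python
def gate_output(normal, mode):
    """Output of a gate whose fault-free output is `normal` (0 or 1)."""
    if mode == "normal":
        return normal
    if mode == "undefined":      # assumed meaning: output is always wrong
        return 1 - normal
    if mode == "stuck_at_0":
        return 0
    if mode == "stuck_at_1":
        return 1
    raise ValueError(f"unknown mode: {mode}")

def dF_dDelta(normal, mode):
    """1 if selecting `mode` changes the output relative to normal behavior."""
    return gate_output(normal, mode) ^ normal

for mode in ("normal", "undefined", "stuck_at_0", "stuck_at_1"):
    print(mode, [dF_dDelta(n, mode) for n in (0, 1)])
```

With the selector set to 1, the variation is 1 for the undefined model, equals the fault-free output F for stuck at 0, and equals ¬F for stuck at 1, matching the last three columns of the table.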


⁴ Note: by "solve" we mean perform a solution process which gives information about the allowable constraint
variables. "No solution" is an allowable return from this process.

⁵ They are symmetrical about (a, b).

⁶ Note that the superscript does not indicate exponentiation but simply the vector element.
5.2  Diagnostic  Equations

As we have said, in order to form a diagnosis, we will define a set of inequalities which
constrain the possible search space and provide a decision function for the diagnosis. If all
the devices are functioning normally, then the derivatives $\frac{dF_i}{d\nabla_j} = 0$ for all outputs $F_i$ and
device failure vectors $\nabla_j$. If one or more devices malfunction, then zero or more of the
derivatives may become non-zero. In some cases the system inputs or the functional
dependencies of the outputs may cause the outputs to still remain as expected under the
normal behavioral conditions due to a poor choice of device failure models. So in general,
we may form two types of constraint equations based on the observed measurements. If
the measurements agree with the expected behavior given the system inputs and state,
then the system may be fault free. If the measurements do not agree, then they can be
used to form an upper bound on the set of components which may be causing the failure
if we know the dependencies.

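Returning to our full adder example, suppose we choose the “unknown model” for the
component behavior. That is, if a device fails, its output is always incorrect. Given the
system behavioral model which we repeat below:

$$\frac{dSUM}{dDEV} = \frac{\partial F_{X_2}}{\partial \Delta_{X_2}} + \frac{\partial F_{X_2}}{\partial d} \frac{\partial F_{X_1}}{\partial \Delta_{X_1}}$$

and

$$\frac{dCAR}{dDEV} = \frac{\partial F_{O_1}}{\partial \Delta_{O_1}} + \frac{\partial F_{O_1}}{\partial e} \frac{\partial F_{A_1}}{\partial \Delta_{A_1}} + \frac{\partial F_{O_1}}{\partial f} \left[ \frac{\partial F_{A_2}}{\partial \Delta_{A_2}} + \frac{\partial F_{A_2}}{\partial d} \frac{\partial F_{X_1}}{\partial \Delta_{X_1}} \right]$$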


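We first compute the partial derivatives for the devices with respect to the set of all
devices (DEV) as shown in the table below:

$$\begin{aligned}
\frac{\partial F_{X_2}}{\partial d} &= \neg\left(\frac{\partial c}{\partial d}\right) = 1 \\
\frac{\partial F_{A_2}}{\partial d} &= c \cdot \neg\left(\frac{\partial c}{\partial d}\right) = c \\
\frac{\partial F_{O_1}}{\partial e} &= \neg f \cdot \neg\left(\frac{\partial f}{\partial e}\right) = \neg f \cdot \neg\nabla_{A_2} \cdot \neg(c \cdot \nabla_{X_1}) \\
\frac{\partial F_{O_1}}{\partial f} &= \neg e \cdot \neg\left(\frac{\partial e}{\partial f}\right) = \neg e \cdot \neg\nabla_{A_1} \\
\frac{\partial F_{DEV}}{\partial \Delta_{DEV}} &= \nabla_{DEV}
\end{aligned}$$
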
Note that since {a, b, c} are held constant at the observation values, the independent variables
are only $\nabla_{DEV} \in DEV$. These derivatives may be substituted into the behavioral
equations giving the operational equations for the full adder:

$$\frac{dSUM}{dDEV} = \nabla_{X_2} + \nabla_{X_1}$$

and

$$\frac{dCAR}{dDEV} = \nabla_{O_1} + \neg f \cdot \neg\nabla_{A_2} \cdot \neg(c \cdot \nabla_{X_1}) \cdot \nabla_{A_1} + \neg e \cdot \neg\nabla_{A_1} \cdot (\nabla_{A_2} + c \cdot \nabla_{X_1})$$

Now, suppose we have made the following measurements on the full adder circuit:

                                           Observed     Expected
    Measurement   a  b  c  d  e  f        SUM  CAR     SUM  CAR     dSUM/dDEV   dCAR/dDEV
    Ob1           1  0  1  1  0  1         1    0       0    1          1           1
    Ob2           1  0  0  1  0  0         0    0       1    0          1           0

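From the columns containing $\frac{dSUM}{dDEV}$ and $\frac{dCAR}{dDEV}$ we see that the first measurement can be
used to construct two constraint equations of the form:

$$0 \neq \left.\frac{dSUM}{dDEV}\right|_{Ob_1} = \nabla_{X_2} + \nabla_{X_1}$$

and

$$0 \neq \left.\frac{dCAR}{dDEV}\right|_{Ob_1} = \nabla_{O_1} + \neg f \cdot \neg\nabla_{A_2} \cdot \neg(c \cdot \nabla_{X_1}) \cdot \nabla_{A_1} + \neg e \cdot \neg\nabla_{A_1} \cdot (\nabla_{A_2} + c \cdot \nabla_{X_1})$$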


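To evaluate the measurement $Ob_1$, we substitute the values $c = 1$, $\neg e = 1$, and $\neg f = 0$
for the measurement into the system behavioral equations giving:

$$0 \neq \left.\frac{dSUM}{dDEV}\right|_{Ob_1} = \nabla_{X_2} + \nabla_{X_1}$$

and

$$0 \neq \left.\frac{dCAR}{dDEV}\right|_{Ob_1} = \nabla_{O_1} + \neg\nabla_{A_1} \cdot (\nabla_{A_2} + \nabla_{X_1})$$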


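We may evaluate the operational equations for the second measurement $Ob_2$ in a similar
fashion. The result is the third equation below, since the equation for $\left.\frac{dSUM}{dDEV}\right|_{Ob_2}$ is a duplicate
of $\left.\frac{dSUM}{dDEV}\right|_{Ob_1}$. Thus, the following three equations form a set of diagnostic constraint
equations which we can solve in order to deduce information about the diagnosis.

$$0 \neq \left.\frac{dSUM}{dDEV}\right|_{Ob_1} = \nabla_{X_2} + \nabla_{X_1}$$

and

$$0 \neq \left.\frac{dCAR}{dDEV}\right|_{Ob_1} = \nabla_{O_1} + \neg\nabla_{A_1} \cdot (\nabla_{A_2} + \nabla_{X_1})$$

and

$$0 = \left.\frac{dCAR}{dDEV}\right|_{Ob_2} = \nabla_{O_1} + \neg\nabla_{A_2} \cdot \nabla_{A_1} + \neg\nabla_{A_1} \cdot \nabla_{A_2}$$
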
Notice that these equations could be inconsistent since our failure model could be incorrect.
This directly addresses the problem cited earlier regarding the completeness of the model.
We are not requiring a complete or correct model to formulate the problem. Rather, we
have formulated the problem at a higher level: the behavioral equations, which are the
differential equations for the variation of the measurements, are independent of the
failure models. The equations are the problem formulation of this particular full adder
problem. Clearly, the system behavioral equations may be easily evaluated to obtain other
particular solutions. The important point is that these equations are formulated in such
a way that one of the embedded subsystems in the MONAD system, the constraint
subsystem, can attempt a solution without interference from the remainder of the system.
If no solution is found, the task of reformulation is not handled within the constraint
subsystem but rather by the current process driving the relation method.

Returning to the full adder diagnosis, a solution set may be obtained by assuming that
these equations hold. That is, by assuming that our failure model for the devices is correct,
we can conclude that $\nabla_{O_1} = \neg\nabla_{A_2} \cdot \nabla_{A_1} = \neg\nabla_{A_1} \cdot \nabla_{A_2} = 0$, thereby reducing the second
to $0 \neq \neg\nabla_{A_1} \cdot \nabla_{X_1} \Rightarrow \nabla_{X_1} = 1$, $\nabla_{A_1} = 0$. From $\neg\nabla_{A_1} \cdot \nabla_{A_2} = 0$ we get $\nabla_{A_2} = 0$,
and thus, the diagnostic interpretation of these equations is that device $\{X_1\}$ is defective,
$\{O_1, A_1, A_2\}$ are good, and device $\{X_2\}$ is unknown under this failure model.
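As a cross-check on the symbolic result, the two measurements can also be tested by brute-force simulation. This is not the constraint-solving method of the paper; it is a sketch that assumes the full adder wiring implied by the equations (d = a ⊕ b on X1, e = a · b on A1, f = c · d on A2, SUM = d ⊕ c on X2, CAR = e + f on O1), the "output is always incorrect" fault model, and a single faulty device. All names are illustrative.

```python
DEVICES = ("X1", "X2", "A1", "A2", "O1")

def full_adder(a, b, c, faulty=frozenset()):
    """Return (SUM, CAR); every device in `faulty` inverts its output."""
    def out(device, value):
        return 1 - value if device in faulty else value
    d = out("X1", a ^ b)
    e = out("A1", a & b)
    f = out("A2", c & d)
    return out("X2", d ^ c), out("O1", e | f)

# Inputs (a, b, c) -> observed (SUM, CAR), from measurements Ob1 and Ob2.
OBSERVATIONS = {(1, 0, 1): (1, 0), (1, 0, 0): (0, 0)}

def explains(faulty):
    """True if the fault set reproduces every observed output pair."""
    return all(full_adder(*inputs, faulty=faulty) == observed
               for inputs, observed in OBSERVATIONS.items())

single_faults = [dev for dev in DEVICES if explains(frozenset({dev}))]
print(single_faults)  # ['X1']
```

Under the single-fault assumption only X1 reproduces both measurements, which agrees with the device identified as defective above.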

6  Applying  Specific  Failure  Models 

The notion of refining the diagnosis bound may be seen more clearly by an example. We
will show how to substitute specific failure models into the behavioral equations in order
to test a hypothesis. As we have already said, we are at liberty to apply a complex failure
model from the outset. But a refinement approach can be useful when the problem is too
large for this to be practical.

In the previous analysis, we first substituted the broadest model, the unknown
model, which has the simplifying characteristic that $\frac{\partial F_{DEV}}{\partial \Delta_{DEV}} = 1 \cdot \nabla$ for all devices. Now
we will create a failure model in which the specific failure mode stuck at 0 is tested. As
before, $\nabla^{mode}$ is the mode selector for each operating mode of the device such that the
device assumes that mode when $\nabla \neq 0$. Typically, we will assume that failure modes for
a device are mutually exclusive, that is, $0 = \nabla_i \cdot \nabla_j$ for $i \neq j$ and $0 \neq \sum_i \nabla_i$. We will define
two mode variables for each device: $\nabla^{NORM}_{DEV}$ and $\nabla^{SAZ}_{DEV}$, where the first selects normal
behavior and the second selects the output stuck at 0 mode.

Recall that the behavioral equations for the full adder are


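$$\frac{dSUM}{dDEV} = \frac{\partial F_{X_2}}{\partial \Delta_{X_2}} + \frac{\partial F_{X_2}}{\partial d} \frac{\partial F_{X_1}}{\partial \Delta_{X_1}}$$

and

$$\frac{dCAR}{dDEV} = \frac{\partial F_{O_1}}{\partial \Delta_{O_1}} + \frac{\partial F_{O_1}}{\partial e} \frac{\partial F_{A_1}}{\partial \Delta_{A_1}} + \frac{\partial F_{O_1}}{\partial f} \left[ \frac{\partial F_{A_2}}{\partial \Delta_{A_2}} + \frac{\partial F_{A_2}}{\partial d} \frac{\partial F_{X_1}}{\partial \Delta_{X_1}} \right]$$
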
    F(a, b)     Normal: ∂F/∂a                         Stuck at Zero: ∂F/∂Δ

    a · b       (b · ¬(∂b/∂a)) · ∇^NORM_DEV           (a · b) · ∇^SAZ_DEV
    a + b       (¬b · ¬(∂b/∂a)) · ∇^NORM_DEV          (a + b) · ∇^SAZ_DEV
    ¬a          ∇^NORM_DEV                            (¬a) · ∇^SAZ_DEV
    a ⊕ b       ¬(∂b/∂a) · ∇^NORM_DEV                 (a ⊕ b) · ∇^SAZ_DEV


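Substituting into the behavioral equations as before:

$$\frac{dSUM}{dDEV} = (d \oplus c) \cdot \nabla^{SAZ}_{X_2} + \neg\left(\frac{\partial c}{\partial d}\right) \cdot \nabla^{NORM}_{X_2} \cdot (a \oplus b) \cdot \nabla^{SAZ}_{X_1}$$

$$\frac{dCAR}{dDEV} = (e + f) \cdot \nabla^{SAZ}_{O_1} + (\neg f) \cdot \neg\left(\frac{\partial f}{\partial e}\right) \cdot \nabla^{NORM}_{O_1} \cdot (a \cdot b) \cdot \nabla^{SAZ}_{A_1}$$

$$\qquad + (\neg e) \cdot \neg\left(\frac{\partial e}{\partial f}\right) \cdot \nabla^{NORM}_{O_1} \cdot \left[ (d \cdot c) \cdot \nabla^{SAZ}_{A_2} + \left(c \cdot \neg\left(\frac{\partial c}{\partial d}\right)\right) \cdot \nabla^{NORM}_{A_2} \cdot (a \oplus b) \cdot \nabla^{SAZ}_{X_1} \right]$$
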


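where $\nabla^{NORM}_{DEV} = \neg\nabla^{SAZ}_{DEV}$. Forming the partial derivatives with respect to all the devices
as before:

$$\begin{aligned}
\neg\left(\frac{\partial f}{\partial e}\right) &= \nabla^{NORM}_{A_2} \cdot (\neg c + \nabla^{NORM}_{X_1}) \\
\neg\left(\frac{\partial e}{\partial f}\right) &= \nabla^{NORM}_{A_1} \\
\neg\left(\frac{\partial c}{\partial d}\right) &= 1
\end{aligned}$$

which we may substitute into the operational equations above giving:

$$\frac{dSUM}{dDEV} = (d \oplus c) \cdot \nabla^{SAZ}_{X_2} + \nabla^{NORM}_{X_2} \cdot (a \oplus b) \cdot \nabla^{SAZ}_{X_1}$$

$$\frac{dCAR}{dDEV} = (e + f) \cdot \nabla^{SAZ}_{O_1} + (\neg f) \cdot \nabla^{NORM}_{O_1} \cdot \nabla^{NORM}_{A_2} \cdot (\neg c + \nabla^{NORM}_{X_1}) \cdot (a \cdot b) \cdot \nabla^{SAZ}_{A_1}$$

$$\qquad + (\neg e) \cdot \nabla^{NORM}_{O_1} \cdot \nabla^{NORM}_{A_1} \cdot \left[ (d \cdot c) \cdot \nabla^{SAZ}_{A_2} + (c) \cdot \nabla^{NORM}_{A_2} \cdot (a \oplus b) \cdot \nabla^{SAZ}_{X_1} \right]$$
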


These operational equations may be evaluated at each of the measurements to obtain
the diagnostic constraint equations as before:


n  /  dCAR  _  0  ,yNORM 

U  T  dDEV  obi  U  T  V  M 


1  .  [V£*  +  V"OHM  •  1] 


0  ± 


am  =  o  +  i-i 

j  r\  r»t/  _ .  v  I  *  * 


dDEV  lob? 


From  the  second  equation,  we  conclude  that  V£onM  =  1  =>  V?"  =  0,  reducing  the 
second  equation  to: 


0  ±  V 


SAZ  ,  'ey  NORM 

T  V  ji. 


which simply means that we have not tested $A_2$. Thus, the diagnosis is that $\{X_1\}$ is stuck
at 0, $\{A_2\}$ is unknown, and $\{O_1, A_1, X_2\}$ are all good under this fault model.
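The stuck-at hypothesis can likewise be cross-checked by simulation. The sketch below is a swapped-in brute-force test, not the symbolic evaluation used in the text: it assumes the same full adder wiring as the equations, a single faulty device, and tries each device stuck at 0 and stuck at 1 against the two measurements. The names are illustrative.

```python
DEVICES = ("X1", "X2", "A1", "A2", "O1")

def full_adder(a, b, c, fault=None):
    """Return (SUM, CAR); `fault` is None or a (device, stuck_value) pair."""
    def out(device, value):
        if fault is not None and fault[0] == device:
            return fault[1]          # output pinned to the stuck value
        return value
    d = out("X1", a ^ b)
    e = out("A1", a & b)
    f = out("A2", c & d)
    return out("X2", d ^ c), out("O1", e | f)

# (inputs a, b, c), observed (SUM, CAR) for measurements Ob1 and Ob2.
OBSERVATIONS = [((1, 0, 1), (1, 0)), ((1, 0, 0), (0, 0))]

consistent = [
    (dev, stuck)
    for dev in DEVICES
    for stuck in (0, 1)
    if all(full_adder(*inputs, fault=(dev, stuck)) == observed
           for inputs, observed in OBSERVATIONS)
]
print(consistent)  # [('X1', 0)]
```

Among single stuck-at faults, only X1 stuck at 0 reproduces both measurements.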


7  An  Abstract  Example 


To show the diversity of this technique, we now turn to an abstract example. Figure 4
shows a diagram of the multiplier-adder circuit shown previously in figure 1. One set of
observations of the measured behavior is shown as values on the various nodes. First we
would like to consider the problem of making various deductions about the functionality
of this circuit without actually considering its detail. Figure 5 shows a diagram of the
circuit device classification. The table below shows the corresponding dependencies for
the devices:

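$$\begin{aligned}
\{X\} &\triangleright \{F_{A_1}\} & \{F_{A_1}\} &\triangleright \{j, k, \Delta_{A_1}\} \\
\{j\} &\triangleright \{F_{M_1}\} & \{F_{M_1}\} &\triangleright \{a, b, \Delta_{M_1}\} \\
\{k\} &\triangleright \{F_{M_2}\} & \{F_{M_2}\} &\triangleright \{c, d, \Delta_{M_2}\} \\
\{Y\} &\triangleright \{F_{A_2}\} & \{F_{A_2}\} &\triangleright \{k, l, \Delta_{A_2}\} \\
\{l\} &\triangleright \{F_{M_3}\} & \{F_{M_3}\} &\triangleright \{e, f, \Delta_{M_3}\} \\
\{\Delta_{M_1}, \Delta_{M_2}, \Delta_{M_3}\} &\triangleright \{MPY\} \\
\{\Delta_{A_1}, \Delta_{A_2}\} &\triangleright \{ADD\} \\
\{MPY, ADD\} &\triangleright \{DEV\}
\end{aligned}$$

Figure 4: Multiplier Adder Observation Set
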

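The corresponding behavioral model is derived as:

$$\frac{dX}{dDEV} = \frac{\partial X}{\partial F_{A_1}} \left[ \frac{\partial F_{A_1}}{\partial k} \frac{\partial k}{\partial F_{M_2}} \frac{\partial F_{M_2}}{\partial \Delta_{M_2}} \frac{\partial \Delta_{M_2}}{\partial MPY} \frac{\partial MPY}{\partial DEV}
+ \frac{\partial F_{A_1}}{\partial j} \frac{\partial j}{\partial F_{M_1}} \frac{\partial F_{M_1}}{\partial \Delta_{M_1}} \frac{\partial \Delta_{M_1}}{\partial MPY} \frac{\partial MPY}{\partial DEV}
+ \frac{\partial F_{A_1}}{\partial \Delta_{A_1}} \frac{\partial \Delta_{A_1}}{\partial ADD} \frac{\partial ADD}{\partial DEV} \right]$$

and

$$\frac{dY}{dDEV} = \frac{\partial Y}{\partial F_{A_2}} \left[ \frac{\partial F_{A_2}}{\partial k} \frac{\partial k}{\partial F_{M_2}} \frac{\partial F_{M_2}}{\partial \Delta_{M_2}} \frac{\partial \Delta_{M_2}}{\partial MPY} \frac{\partial MPY}{\partial DEV}
+ \frac{\partial F_{A_2}}{\partial l} \frac{\partial l}{\partial F_{M_3}} \frac{\partial F_{M_3}}{\partial \Delta_{M_3}} \frac{\partial \Delta_{M_3}}{\partial MPY} \frac{\partial MPY}{\partial DEV}
+ \frac{\partial F_{A_2}}{\partial \Delta_{A_2}} \frac{\partial \Delta_{A_2}}{\partial ADD} \frac{\partial ADD}{\partial DEV} \right]$$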


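By definition of a connection:

$$1 = \frac{\partial X}{\partial F_{A_1}} = \frac{\partial Y}{\partial F_{A_2}}
= \frac{\partial j}{\partial F_{M_1}} = \frac{\partial k}{\partial F_{M_2}} = \frac{\partial l}{\partial F_{M_3}}
= \frac{\partial \Delta_{M_1}}{\partial MPY} = \frac{\partial \Delta_{M_2}}{\partial MPY} = \frac{\partial \Delta_{M_3}}{\partial MPY}
= \frac{\partial \Delta_{A_1}}{\partial ADD} = \frac{\partial \Delta_{A_2}}{\partial ADD}
= \frac{\partial MPY}{\partial DEV} = \frac{\partial ADD}{\partial DEV}$$

Substituting these into the above we get:

$$\frac{dX}{dDEV} = \frac{\partial F_{A_1}}{\partial k} \frac{\partial F_{M_2}}{\partial \Delta_{M_2}} + \frac{\partial F_{A_1}}{\partial j} \frac{\partial F_{M_1}}{\partial \Delta_{M_1}} + \frac{\partial F_{A_1}}{\partial \Delta_{A_1}}$$

and

$$\frac{dY}{dDEV} = \frac{\partial F_{A_2}}{\partial k} \frac{\partial F_{M_2}}{\partial \Delta_{M_2}} + \frac{\partial F_{A_2}}{\partial l} \frac{\partial F_{M_3}}{\partial \Delta_{M_3}} + \frac{\partial F_{A_2}}{\partial \Delta_{A_2}}$$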


7.1  Diagnosis  by  Dependency 

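From simply examining the device dependencies we may make conclusions similar to those
made about the full adder. For example, if our information is that the X output seems to
produce the wrong answer sometimes while the Y output is always correct, then we may
write:

$$0 \neq \frac{dX}{dDEV} = \frac{\partial F_{A_1}}{\partial k} \frac{\partial F_{M_2}}{\partial \Delta_{M_2}} + \frac{\partial F_{A_1}}{\partial j} \frac{\partial F_{M_1}}{\partial \Delta_{M_1}} + \frac{\partial F_{A_1}}{\partial \Delta_{A_1}}$$

and

$$0 = \frac{dY}{dDEV} = \frac{\partial F_{A_2}}{\partial k} \frac{\partial F_{M_2}}{\partial \Delta_{M_2}} + \frac{\partial F_{A_2}}{\partial l} \frac{\partial F_{M_3}}{\partial \Delta_{M_3}} + \frac{\partial F_{A_2}}{\partial \Delta_{A_2}}$$

From the second equation, we can hypothesize that $\frac{\partial F_{M_2}}{\partial \Delta_{M_2}} = \frac{\partial F_{M_3}}{\partial \Delta_{M_3}} = \frac{\partial F_{A_2}}{\partial \Delta_{A_2}} = 0$, which
means that $\{M_2, M_3, A_2\}$ are all good. This reduces the first equation to $0 \neq \frac{dX}{dDEV} = \frac{\partial F_{A_1}}{\partial j} \frac{\partial F_{M_1}}{\partial \Delta_{M_1}} + \frac{\partial F_{A_1}}{\partial \Delta_{A_1}}$.
By using the simple failure model that a device failure is always observable,
$\frac{\partial F_{A_1}}{\partial j} = 1$, we then have $0 \neq \frac{dX}{dDEV} = \frac{\partial F_{M_1}}{\partial \Delta_{M_1}} + \frac{\partial F_{A_1}}{\partial \Delta_{A_1}}$, which means that either or
both of $\{M_1, A_1\}$ could be faulty.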
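The dependency-only argument above amounts to a set difference: devices feeding a misbehaving output are implicated, and devices feeding an always-correct output are (by hypothesis) exonerated. A minimal sketch follows; the dependency table is read off the behavioral equations, and `suspects` is an illustrative name.

```python
# Devices each output depends on, read off the behavioral equations.
DEPS = {
    "X": {"M1", "M2", "A1"},
    "Y": {"M2", "M3", "A2"},
}

def suspects(deps, wrong, correct):
    """Devices implicated by wrong outputs and not exonerated by correct ones.

    Exoneration is a hypothesis: a faulty device can still produce a correct
    output for particular inputs, so this is a first-pass filter only.
    """
    implicated = set().union(*(deps[o] for o in wrong))
    exonerated = set().union(*(deps[o] for o in correct))
    return implicated - exonerated

print(sorted(suspects(DEPS, wrong=["X"], correct=["Y"])))  # ['A1', 'M1']
```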


7.2  Diagnosis  Using  an  Abstract  Model 

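Suppose we wish to decide which device is faulty but we have no detailed information
about the functional behavior of the multipliers and adders other than the mathematical
notions of their purpose. So for example, the multiplier-adder circuit might be modeled as:

$$\begin{aligned}
j &= F_{M_1}(a, b, \Delta_{M_1}) = a \cdot b \\
k &= F_{M_2}(c, d, \Delta_{M_2}) = c \cdot d \\
l &= F_{M_3}(e, f, \Delta_{M_3}) = e \cdot f \\
X &= F_{A_1}(j, k, \Delta_{A_1}) = j + k \\
Y &= F_{A_2}(k, l, \Delta_{A_2}) = k + l
\end{aligned}$$

where, in this abstract model, $+$ means addition and $\cdot$ means multiplication in the conventional
sense. Repeating the simplified behavioral equations:

$$\frac{dX}{dDEV} = \frac{\partial F_{A_1}}{\partial k} \frac{\partial F_{M_2}}{\partial \Delta_{M_2}} + \frac{\partial F_{A_1}}{\partial j} \frac{\partial F_{M_1}}{\partial \Delta_{M_1}} + \frac{\partial F_{A_1}}{\partial \Delta_{A_1}}$$

and

$$\frac{dY}{dDEV} = \frac{\partial F_{A_2}}{\partial k} \frac{\partial F_{M_2}}{\partial \Delta_{M_2}} + \frac{\partial F_{A_2}}{\partial l} \frac{\partial F_{M_3}}{\partial \Delta_{M_3}} + \frac{\partial F_{A_2}}{\partial \Delta_{A_2}}$$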

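Taking the device model derivatives for normal behavior from the abstract device models
we find:

$$\begin{aligned}
\frac{\partial F_{A_1}}{\partial j} &= \frac{\partial (j + k)}{\partial j} = 1 &
\frac{\partial F_{A_1}}{\partial k} &= \frac{\partial (j + k)}{\partial k} = 1 \\
\frac{\partial F_{A_2}}{\partial k} &= \frac{\partial (k + l)}{\partial k} = 1 &
\frac{\partial F_{A_2}}{\partial l} &= \frac{\partial (k + l)}{\partial l} = 1
\end{aligned}$$

Substituting these values into the above equations we have:

$$\frac{dX}{dDEV} = \frac{\partial F_{M_2}}{\partial \Delta_{M_2}} + \frac{\partial F_{M_1}}{\partial \Delta_{M_1}} + \frac{\partial F_{A_1}}{\partial \Delta_{A_1}}$$

and

$$\frac{dY}{dDEV} = \frac{\partial F_{M_2}}{\partial \Delta_{M_2}} + \frac{\partial F_{M_3}}{\partial \Delta_{M_3}} + \frac{\partial F_{A_2}}{\partial \Delta_{A_2}}$$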


In order to perform diagnosis, we might specify a failure model based on the notion
of the failure of a particular bit in either the input circuitry or the output circuitry. For
example, to specify that bit i of the output ($OUT_i$) has failed we will assume that $\frac{\partial OUT_i}{\partial \Delta} \neq 0$.
The corresponding failure model might be:

    Device   Normal: ∂F/∂a   Bit i Stuck at 0: ∂F/∂Δ            Bit i Stuck at 1: ∂F/∂Δ

    MPY      b               DEV(OUT_i) · (−2^i) · ∇_DEV        DEV(¬OUT_i) · (2^i) · ∇_DEV
    ADD      1               DEV(OUT_i) · (−2^i) · ∇_DEV        DEV(¬OUT_i) · (2^i) · ∇_DEV


where $DEV(OUT_i)$ is a predicate indicating that output bit i of OUT of device DEV is
set to a logical 1. Substituting these into the above equations we have:

$$\frac{dX}{dDEV} = M_1(OUT_1) \cdot (-2^1) \cdot \nabla_{M_1} + M_2(OUT_1) \cdot (-2^1) \cdot \nabla_{M_2} + A_1(OUT_1) \cdot (-2^1) \cdot \nabla_{A_1}$$

$$\frac{dY}{dDEV} = M_2(OUT_1) \cdot (-2^1) \cdot \nabla_{M_2} + M_3(OUT_1) \cdot (-2^1) \cdot \nabla_{M_3} + A_2(OUT_1) \cdot (-2^1) \cdot \nabla_{A_2}$$


Suppose that we take two measurements as in the full adder example:

                                              Observed    Expected
    Measurement   a   b   c   d   e   f       X    Y      X    Y      j   k   l
    Ob1           3   2   3   2   3   2      10   12     12   12      6   6   6
    Ob2           2   2   3   2   2   2      10   10     10   10      4   6   4


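Then we may form the following constraint equations:

$$\begin{aligned}
-2 &= \left.\frac{dX}{dDEV}\right|_{Ob_1} = 1 \cdot (-2) \cdot \nabla_{M_1} + 1 \cdot (-2) \cdot \nabla_{M_2} \\
0 &= \left.\frac{dY}{dDEV}\right|_{Ob_1} = M_2(OUT_1) \cdot (-2) \cdot \nabla_{M_2} + M_3(OUT_1) \cdot (-2) \cdot \nabla_{M_3} \\
0 &= \left.\frac{dX}{dDEV}\right|_{Ob_2} = 1 \cdot (-2) \cdot \nabla_{M_2} + 1 \cdot (-2) \cdot \nabla_{A_1} \\
0 &= \left.\frac{dY}{dDEV}\right|_{Ob_2} = M_2(OUT_1) \cdot (-2) \cdot \nabla_{M_2} + A_2(OUT_1) \cdot (-2) \cdot \nabla_{A_2}
\end{aligned}$$

which are easily solved to give $\nabla_{M_1} = 1$ with all others being 0, which means that $M_1$ having
bit 1 stuck at 0 could explain the measurements.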
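The conclusion can be cross-checked by simulating the abstract model with one output bit forced to 0. This sketch is a brute-force check rather than the constraint solution used above; it assumes a single faulty device, bit positions 0 through 3 (bit i has weight 2^i), and the arithmetic device models j = a·b, k = c·d, l = e·f, X = j + k, Y = k + l. The helper names are illustrative.

```python
DEVICES = ("M1", "M2", "M3", "A1", "A2")

def clear_bit(value, bit):
    """Return `value` with the given bit forced to 0 (stuck at 0)."""
    return value & ~(1 << bit)

def circuit(a, b, c, d, e, f, fault=None):
    """Return (X, Y); `fault` is None or a (device, bit) stuck-at-0 pair."""
    def out(device, value):
        if fault is not None and fault[0] == device:
            return clear_bit(value, fault[1])
        return value
    j = out("M1", a * b)
    k = out("M2", c * d)
    l = out("M3", e * f)
    return out("A1", j + k), out("A2", k + l)

# (inputs a..f), observed (X, Y) for measurements Ob1 and Ob2.
OBSERVATIONS = [((3, 2, 3, 2, 3, 2), (10, 12)),
                ((2, 2, 3, 2, 2, 2), (10, 10))]

consistent = [
    (dev, bit)
    for dev in DEVICES
    for bit in range(4)
    if all(circuit(*inputs, fault=(dev, bit)) == observed
           for inputs, observed in OBSERVATIONS)
]
print(consistent)  # [('M1', 1)]
```

M2 with bit 1 stuck at 0 would also give X = 10 in the first measurement, but it would change Y to 10, so the always-correct Y output rules it out.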
9  Conclusions

We have described a formal approach to problem formulation for the diagnosis of static,
loop-free,  digital  logic  circuits.  The  approach  may  be  extended  to  the  modeling  of  formal 
logic  by  their  differential  behavior  for  such  tasks  as  the  control  of  reasoning  and  knowledge 
state  revision.  The  approach  is  easily  extended  to  quasi-static  systems  by  defining  internal 
state  for  the  models.  Future  reports  on  the  current  effort  to  formalize  reasoning  about 
dynamic  processes  will  directly  deal  with  time  varying  systems  with  loops. 

The  important  point  of  this  paper  is  that  the  approach  is  uniform  in  its  use  of  the 
relation  method  of  the  MONAD  system  in  which  it  is  also  used  for  more  complex  tasks 
such  as  reasoning  by  analogy.  In  these  tasks,  it  is  essential  that  various  kinds  of  objects  be 
treated  uniformly  by  the  reasoning  processes  so  that  the  control  of  the  reasoning  process 
may  be  abstracted.  Our  work  in  the  near  future  will  address  these  issues. 